I have two dataframes, df1 and df2. df2 consists of the columns "tagname" and "value". The dictionary "bucket_dict" holds the data from df2:
bucket_dict = dict(zip(df2.tagname,df2.value))
df1 has millions of rows and three columns: "apptag", "comments" and "Type". I want to match the two dataframes as follows: if a key from bucket_dict is contained in df1["apptag"], then set df1["comments"] to that key and df1["Type"] to the corresponding value, bucket_dict[key]. I used the code below:
for each_tag in bucket_dict:
    # Rows whose apptag contains the key (case-insensitive); str.match would only test the start of the string
    mask = df1["apptag"].str.contains(each_tag, case=False, na=False)
    df1.loc[mask, "comments"] = each_tag
    df1.loc[mask, "Type"] = bucket_dict[each_tag]
Is there a more efficient way to do this? The loop takes a long time on millions of rows.
Bucketing dataframe from which the dictionary is created:
bucketing_df = pd.DataFrame([["pen", "study"], ["pencil", "study"], ["ersr","study"],["rice","grocery"],["wht","grocery"]], columns=['tagname', 'value'])
The other dataframe:
output_df = pd.DataFrame([["test123-pen", "pen", " "], ["test234-pencil", "pencil", " "], ["test234-rice", "rice", " "]], columns=['apptag', 'comments', 'Type'])
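To make the goal concrete, here is a minimal sketch of a single-pass alternative to the per-key loop, not a definitive solution. It assumes the keys are literal substrings (hence `re.escape`), that a longer key should win over a shorter one at the same position (so "pencil" beats "pen"), and that the third column is named "Type" as in the description. The names `key_lookup`, `pattern`, `found`, `matched_key` and `has_match`, as well as the extra non-matching row, are only illustrative. Note that it picks the leftmost match in each string, which can differ from the loop, where the last matching key overwrites earlier ones.

```python
import re

import pandas as pd

bucketing_df = pd.DataFrame(
    [["pen", "study"], ["pencil", "study"], ["ersr", "study"],
     ["rice", "grocery"], ["wht", "grocery"]],
    columns=["tagname", "value"],
)
output_df = pd.DataFrame(
    [["test123-pen", "pen", " "], ["test234-pencil", "pencil", " "],
     ["test234-rice", "rice", " "],
     ["test999-unknown", " ", " "]],  # hypothetical extra row with no matching key
    columns=["apptag", "comments", "Type"],
)

bucket_dict = dict(zip(bucketing_df.tagname, bucketing_df.value))

# Map the lower-cased key back to the original key, for case-insensitive lookup.
key_lookup = {key.lower(): key for key in bucket_dict}

# One alternation over all keys, longest first so "pencil" is tried before "pen".
keys_longest_first = sorted(bucket_dict, key=len, reverse=True)
pattern = "(" + "|".join(re.escape(key) for key in keys_longest_first) + ")"

# A single regex pass over the column; rows without any key yield NaN.
found = output_df["apptag"].str.extract(pattern, flags=re.IGNORECASE, expand=False)
matched_key = found.str.lower().map(key_lookup)
has_match = matched_key.notna()

# Update only the rows where a key was found; other rows keep their values.
output_df.loc[has_match, "comments"] = matched_key[has_match]
output_df.loc[has_match, "Type"] = matched_key[has_match].map(bucket_dict)

print(output_df)
```

With this approach the column is scanned once instead of once per dictionary key, which is where the loop spends its time when bucket_dict is large. Whether it is actually faster on the real data would still need to be measured.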