I have two dataframes. One has a column in which some values are product ids and the other values are other information that I have to keep. The other dataframe contains the product_id and additional information on these products.
Now I'd like to merge the two dataframes on the product_id, and in cases where a value is not a product id I'd like to fill the added columns with NaN. So I basically want to enrich one dataframe with data from the other. My product ids are strings, and I can't convert them to ints since the rest of the values in the column need to be strings.
I have tried several things. First, I wrote a function that checks whether the value consists of digits and, if so, gets the information from the other dataframe. Below are the code and a rough sketch of what the data looks like.
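To make the goal concrete, the result I am after corresponds to what a left join would produce. Below is a minimal sketch, assuming both key columns hold strings and are named `page_name` and `key` as in my data; the `info_on_product` values are made up purely for illustration.

```python
import pandas as pd

# Left table: page_name mixes product ids and free text, all stored as strings.
case_table = pd.DataFrame(
    {"page_name": ["202020340", "200304020", "text", "202503050", "3045060", "text2"]}
)

# Right table: one row per product. The info values are invented placeholders.
product_info = pd.DataFrame(
    {
        "key": ["202020340", "200304020", "202503050", "3045060"],
        "info_on_product": ["info A", "info B", "info C", "info D"],
    }
)

# Make sure both join columns have the same type (strings) before joining.
case_table["page_name"] = case_table["page_name"].astype(str)
product_info["key"] = product_info["key"].astype(str)

# how="left" keeps every row of case_table; rows without a matching key
# get NaN in the columns that come from product_info.
enriched = case_table.merge(
    product_info, how="left", left_on="page_name", right_on="key"
)
print(enriched)
```

With this kind of join, the rows containing `text` and `text2` would be kept, with NaN in `info_on_product`.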

    import re

    def get_additional_info(case_table, product_info):
        for page in case_table['page_name']:
            if re.match(r'\d{6,}', page):
                return product_info[product_info['Key']==page]

case_table:

    page_name    timestamp    some_columm    some_other_column
    202020340
    200304020
    text
    202503050
    3045060
    text2

product_info:

    key          info_on_product
    202020340
    200304020
    202503050
    3045060

However, it only returned an empty dataframe. When I tested it with specific product ids (in this case called page_name), I did get results; it just didn't seem to work inside the function.
I have also tried a similar method using apply. That didn't work because I couldn't figure out how to pass it two arguments. I have also tried an approach using pandasql, which didn't work either.
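For reference, this is roughly what I was attempting with apply. It is only a sketch under assumptions: `lookup_info` is a hypothetical helper name, the sample values are made up, and it uses `re.fullmatch` instead of `re.match` so that only values consisting entirely of digits count as product ids. As far as I understand, `Series.apply` accepts an `args` tuple for extra positional arguments, which would be one way to hand over the second dataframe.

```python
import re

import pandas as pd


def lookup_info(page, product_info):
    """Return info_on_product for a product id, or None for any other value."""
    if re.fullmatch(r"\d{6,}", page):
        matches = product_info.loc[product_info["key"] == page, "info_on_product"]
        if not matches.empty:
            return matches.iloc[0]
    return None


# Made-up sample data in the shape sketched above.
case_table = pd.DataFrame({"page_name": ["202020340", "text", "3045060"]})
product_info = pd.DataFrame(
    {"key": ["202020340", "3045060"], "info_on_product": ["info A", "info B"]}
)

# The first argument of lookup_info is each page_name value;
# product_info is passed along as the extra positional argument.
case_table["info_on_product"] = case_table["page_name"].apply(
    lookup_info, args=(product_info,)
)
print(case_table)
```

This looks up one row at a time, so I would expect it to be slow on a large table compared with a single join.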