I have two dataframes, df1 and df2. df1 has 10 columns: column 0 holds the original image names, and the remaining columns hold their features and the target variable. df2 has a single column that holds the augmented image names. For each row in df2, I want to copy over the values from the df1 row whose original image name (column 0 of df1) is a substring of the augmented image name (column 0 of df2). For example, the original image name is '29703_left' and the augmented image name is '/29703_left.jpg_0_722.jpeg'.

How can I do that?

Thank you

  • It'd help if you posted some example data and the output you want. See [How to make good reproducible pandas examples](/q/20109391/4518341). You can [edit]. BTW, if you want more tips, check out [ask]. – wjandrea May 09 '22 at 22:10

1 Answer

If the image names are all like that, you could use regex to extract the original from the augmented, then merge.

>>> df2['original'] = df2['augmented'].str.extract(r'/(.*?)\.')
>>> df2.merge(df1, how='left', on='original')
                    augmented    original features target
0  /29703_left.jpg_0_722.jpeg  29703_left   [1, 2]      x

(Here features and target are some dummy data I added.)
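For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same extract-then-merge idea. The column names (`original`, `augmented`, `features`, `target`) and all the sample rows are assumptions made up for illustration, since the question didn't include data. It also assumes every augmented name starts with `/` and that the original name contains no `.`.

```python
import pandas as pd

# Hypothetical stand-ins for the asker's data; column names and values are assumptions.
df1 = pd.DataFrame({
    "original": ["29703_left", "29703_right"],
    "features": [[1, 2], [3, 4]],
    "target": ["x", "y"],
})
df2 = pd.DataFrame({
    "augmented": [
        "/29703_left.jpg_0_722.jpeg",
        "/29703_right.jpg_1_101.jpeg",
        "/99999_left.jpg_0_1.jpeg",  # no matching row in df1
    ],
})

# Capture the text between the leading "/" and the first ".".
# expand=False returns a Series rather than a one-column DataFrame.
df2["original"] = df2["augmented"].str.extract(r"/(.*?)\.", expand=False)

# A left merge keeps every augmented row; rows without a match get NaN.
merged = df2.merge(df1, how="left", on="original")
print(merged)
```

With `how='left'`, augmented images that have no counterpart in df1 stay in the result with missing values, which makes unmatched names easy to spot.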

wjandrea