How to align DataFrames in pandas using substring matching?

Asked May 02 '23 at 12:52

Active May 02 '23 at 12:52

I have two columns from two pandas DataFrames with measurement data that I'd like to align by substring matching. More precisely, if the string in df1['col1'] is a substring of the string in df2['col2'], I want to merge the corresponding rows.

My very naive approach:

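matches = {}
for string in df1['col1']:
    # Boolean mask over df2 rows whose col2 contains this df1 value (literal, case-insensitive)
    matches[string] = df2['col2'].str.contains(string, regex=False, case=False)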

How can I approach the task in a more efficient way?
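One possible direction, as a rough sketch only: build every (df1 row, df2 row) pair with a cross join and keep the pairs where col1 occurs inside col2. The sample frames and the val1/val2 columns below are hypothetical stand-ins, not real data, and how="cross" assumes pandas 1.2 or later.

```python
import pandas as pd

# Hypothetical sample data; only the col1/col2 names come from the question.
df1 = pd.DataFrame({"col1": ["abc", "xyz"], "val1": [1, 2]})
df2 = pd.DataFrame(
    {"col2": ["00ABC11", "foo", "xyz-9", "abcxyz"], "val2": [10, 20, 30, 40]}
)

# Pair every df1 row with every df2 row (Cartesian product).
pairs = df1.merge(df2, how="cross")

# Keep the pairs where col1 is a literal, case-insensitive substring of col2.
mask = [a.lower() in b.lower() for a, b in zip(pairs["col1"], pairs["col2"])]
aligned = pairs[mask].reset_index(drop=True)

print(aligned)
```

The intermediate frame has len(df1) * len(df2) rows, so this is probably only practical when at least one of the two frames is small.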

I found this post about filtering by substring criteria, which uses an or-chained regex pattern to filter. If I'm not mistaken, though, it only says whether a row matched, not which value matched it, and that is the information I would need to align the columns:

df.apply(lambda col: col.str.contains('foo|bar', na=False), axis=1)
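For illustration, the or-chained pattern might be adapted to report which value matched by wrapping it in a capture group and using str.extract instead of str.contains. This is a sketch under assumptions: the sample frames, the val1/val2 columns, and the helper column key are hypothetical, and only the first match per df2 row is captured, so rows containing several df1 values would be aligned with just one of them.

```python
import re

import pandas as pd

# Hypothetical sample data; only the col1/col2 names come from the question.
df1 = pd.DataFrame({"col1": ["abc", "xyz"], "val1": [1, 2]})
df2 = pd.DataFrame({"col2": ["00ABC11", "foo", "xyz-9"], "val2": [10, 20, 30]})

# Escape each value so it is matched literally, then or-chain inside one capture group.
pattern = "(" + "|".join(re.escape(s) for s in df1["col1"]) + ")"

# str.extract returns the first captured match per row (NaN where nothing matches).
key = df2["col2"].str.extract(pattern, flags=re.IGNORECASE, expand=False).str.lower()

# Merge on the lower-cased matched value; unmatched df2 rows drop out of the inner join.
merged = df2.assign(key=key).merge(
    df1.assign(key=df1["col1"].str.lower()), on="key", how="inner"
)

print(merged)
```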

asked May 02 '23 at 12:52

user3000

Please provide a reproducible example of your two dataframes and the matching expected output for clarity – mozway May 02 '23 at 12:54

0 Answers