
I have two pandas DataFrames that share a column, 'name'. I want to merge the two DataFrames on that column, but the formatting of 'name' differs a lot between them.

** For reference -- the 'name' column stores the names of public companies **

Some of the differences are due to punctuation, or capitalization, or the company's short name vs its long name. For example, 'pepsi co.' in df1 may be 'pepsi co', 'pepsi cola holding company' or 'pepsico, inc.' in df2.

So what I need is to merge the two DataFrames into one while ignoring these differences. Otherwise, only around 10% of the names will match up.
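
To make the goal concrete, here is a rough sketch of the kind of matching I have in mind, using only the standard library (`re` and `difflib`). The suffix list, the 0.6 cutoff, and the helper names `normalize_name` and `best_match` are my own assumptions for illustration, not anything pandas provides.

```python
import re
from difflib import get_close_matches

# Hypothetical list of corporate suffixes to drop; real data would need more.
SUFFIXES = {"inc", "co", "corp", "corporation", "company",
            "holding", "holdings", "ltd", "llc", "plc"}

def normalize_name(name):
    """Lowercase, replace punctuation with spaces, drop common suffixes."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", name.lower())
    tokens = [t for t in cleaned.split() if t not in SUFFIXES]
    return " ".join(tokens)

def best_match(name, candidates, cutoff=0.6):
    """Return the candidate closest to `name` after normalization, or None."""
    # Map each normalized candidate back to its original spelling.
    lookup = {normalize_name(c): c for c in candidates}
    hits = get_close_matches(normalize_name(name), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None

df2_names = ["pepsi co", "coca-cola company", "apple inc."]
print(best_match("pepsi co.", df2_names))  # prints: pepsi co
```

The matched name could then be stored in a new column of df1 and used as the key for an ordinary `merge`, but I don't know if this is a sensible approach or how well it would scale.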

Any ideas of what to do? Thank you :)

kylala

0 Answers