I have two pandas dataframes, both of which contain inaccurate values:

import pandas as pd

df1 = pd.DataFrame({'F_Name': ['Alice', 'Bob', 'Charlie', 'Dave'], 'L_Name': ['Smith', 'Sharma', 'Puth', 'Bautista'], 'Age': [15, 24, 32, 40]})
df2 = pd.DataFrame({'F_Name': ['Charlie', 'David', 'Eve', 'Alice'], 'L_Name': ['Puth', 'Bautista', 'Angeline', 'Wonderland'], 'Age': [32, 19, 21, 16]})

How can I do an outer merge of the two dataframes and identify the rows that did not merge with each other?

I have tried using a normal outer join:

df_merge = pd.merge(df1, df2, how='outer', on=['F_Name', 'L_Name', 'Age'])

but it only combines the two dataframes into a single dataframe, without showing which rows failed to match.

The dataframe that I am expecting after merging should be something like this:

[image: expected merged dataframe]

Any idea how to achieve this? Thanks.

Ronathan
  • Using `indicator` can help: `df1.merge(df2, on=['F_Name'], how='outer', suffixes=['_x', '_y'], indicator=True)` – ali bakhtiari Jan 06 '23 at 08:28
  • This is not really a merge as the keys are not identical; you need a fuzzy merge – mozway Jan 06 '23 at 18:33
  • Does this answer your question? [Pandas fast fuzzy match](https://stackoverflow.com/questions/69170278/pandas-fast-fuzzy-match) – Laurent Jan 08 '23 at 06:49
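
The `indicator` suggestion from the comments can be sketched roughly as follows. This is a minimal example, assuming the merge keys are all three columns from the question. The names `merged`, `left_only`, `right_only`, and `matched` are illustrative and not from the original post. It only flags exact matches, so near-matches such as Dave/David would still need the fuzzy matching mentioned in the comments.

```python
import pandas as pd

df1 = pd.DataFrame({'F_Name': ['Alice', 'Bob', 'Charlie', 'Dave'],
                    'L_Name': ['Smith', 'Sharma', 'Puth', 'Bautista'],
                    'Age': [15, 24, 32, 40]})
df2 = pd.DataFrame({'F_Name': ['Charlie', 'David', 'Eve', 'Alice'],
                    'L_Name': ['Puth', 'Bautista', 'Angeline', 'Wonderland'],
                    'Age': [32, 19, 21, 16]})

# indicator=True adds a '_merge' column whose values are
# 'left_only', 'right_only', or 'both' for each row of the result.
merged = df1.merge(df2, how='outer',
                   on=['F_Name', 'L_Name', 'Age'],
                   indicator=True)

# Rows that exist only in df1, only in df2, or in both (hypothetical names).
left_only = merged[merged['_merge'] == 'left_only']
right_only = merged[merged['_merge'] == 'right_only']
matched = merged[merged['_merge'] == 'both']

print(merged)
```

Filtering on `_merge` separates the unmatched rows of each dataframe without a second pass over the data.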

0 Answers