I have 2 dataframes with different no. of rows and different column names. I want to compare and get the matching rows specific to that columns as output.
e.g
df1 = pd.DataFrame({'foo': [11, 22, 33], 'bar': ['aa', 'ab', 'ac'], 'foobar': [111, 222, 333]})
df2 = pd.DataFrame({'AA': [1,22], 'BB': ['see','ab'], 'CC': [123,222]})
df1: foo bar foobar
0 11 aa 111
1 22 ab 222
2 33 ac 333
df2: AA BB CC
0 1 see 123
1 22 ab 222
df2 not necessarily has to have same no of rows and columns.
expected output: for matching rows of df2 in df1
df3:
foo bar foobar
1 22 ab 222
I have tried using np.all, but this seems to work only if we have same no. of rows or single row in df2.
df3 = df1.loc[np.all(df1[['bar','foobar']].values == df2[['BB','CC']].values, axis=1),:]
Essentially needed, difference rows or matching rows from any of the df1 or df2.
expected output: for unmatched rows of df1 from df2
df3:
foo bar foobar
0 11 aa 111
2 33 ac 333
Imagine in this case: The order of columns are different, column mapping I will do. example: ( if columns values of a,b,c of df1 == column values of d,e,f in df2) get me the matched rows form df1 or df2.
df1 = pd.DataFrame({'foo': [11, 22, 33], 'bar': ['aa', 'ab', 'ac'], 'foobar': [111, 222, 333], 'barfoo':[2,22,34]})
df2 = pd.DataFrame({'AA': [22,33], 'CC': [222,333], 'BB': ['ab','ac']})
output : In this case I am matching on (foo:AA, bar:BB, foobar:CC)
df3:
foo bar foobar barfoo
1 22 ab 222 22
2 33 ac 333 34
Appreciate and thanks.