1

I am working on a pandas project where I have to compare two DataFrames: pick one row from DF1, compare it with every row of DF2 (the matching rows in DF2 may be at different positions), and return True or False. Below are my DataFrames.

DF1

Animals    planets
dog         Earth 
dragon      Mars
cat         Pluto

DF2

Animals    planets
cat         Pluto
dog         Earth 

I am pretty new to pandas. I tried a plain compare, but it throws an error (Can only compare identically-labeled DataFrame objects):

df1.compare(df2)
Divakar R
  • I think what you are looking for is here: https://stackoverflow.com/questions/20225110/comparing-two-dataframes-and-getting-the-differences/20228113#20228113 – Joe Ferndz Nov 11 '20 at 02:57

2 Answers

0

DataFrame.compare() only works on DataFrames whose row and column labels are identical.

When the labels match, df1.compare(df2) shows you the differences.

Make sure that df1.index.equals(df2.index) returns True; we can't see the indices in your example.
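
As a rough illustration of that check, here is a minimal sketch. It assumes the two DataFrames from the question were built with default integer indices, which the question does not show; the name same_labels is only for illustration.

```python
import pandas as pd

# Assumption: default RangeIndex on both frames, as in the question's printout.
df1 = pd.DataFrame({"Animals": ["dog", "dragon", "cat"],
                    "planets": ["Earth", "Mars", "Pluto"]})
df2 = pd.DataFrame({"Animals": ["cat", "dog"],
                    "planets": ["Pluto", "Earth"]})

# compare() needs the same row labels and the same column labels on both sides.
same_labels = df1.index.equals(df2.index) and df1.columns.equals(df2.columns)
print(same_labels)

if same_labels:
    print(df1.compare(df2))
else:
    print("Labels differ, so compare() would raise ValueError")
```

With these two frames the row labels differ (three rows against two), so the check fails and compare() is skipped rather than raising.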

sudonym
0

What you are looking for is in this thread: Comparing two dataframes and getting the differences

I have flagged this post as a duplicate. However, for reference, here's the answer.

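import pandas as pd

# Rebuild the two DataFrames from the question.
columns = ['Animals', 'planets']

df1 = pd.DataFrame([['dog', 'Earth'],
                    ['dragon', 'Mars'],
                    ['cat', 'Pluto']], columns=columns)
print(df1)

df2 = pd.DataFrame([['cat', 'Pluto'],
                    ['dog', 'Earth']], columns=columns)
print(df2)

# Stack both frames and give every row a unique index label.
df = pd.concat([df1, df2])
df = df.reset_index(drop=True)

# Group identical rows together; a group of size 1 is a row that
# appears only once across the two frames.
df_gpby = df.groupby(list(df.columns))
idx = [x[0] for x in df_gpby.groups.values() if len(x) == 1]

# Keep only those unmatched rows.
df = df.reindex(idx)
print(df)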

df1:

  Animals planets
0     dog   Earth
1  dragon    Mars
2     cat   Pluto

df2:

  Animals planets
0     cat   Pluto
1     dog   Earth

df: (difference after comparison)

  Animals planets
1  dragon    Mars
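
If what you need is a True/False flag for each row of df1, one possible alternative is a left merge with indicator=True. This is a different technique from the groupby approach above, offered as a sketch rather than the definitive answer; the column name in_df2 is made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"Animals": ["dog", "dragon", "cat"],
                    "planets": ["Earth", "Mars", "Pluto"]})
df2 = pd.DataFrame({"Animals": ["cat", "dog"],
                    "planets": ["Pluto", "Earth"]})

# Left-merge on all shared columns; the _merge column records where each
# row was found. drop_duplicates() keeps repeated df2 rows from
# multiplying the df1 rows in the result.
merged = df1.merge(df2.drop_duplicates(), how="left", indicator=True)

# True where the df1 row also exists somewhere in df2, whatever its position.
# in_df2 is a hypothetical column name chosen for this example.
df1["in_df2"] = (merged["_merge"] == "both").to_numpy()
print(df1)
```

Here dog/Earth and cat/Pluto are flagged True and dragon/Mars is flagged False, matching the difference shown above.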
Joe Ferndz