What is the most efficient way to find which rows differ between two pandas DataFrames?
Imagine we have the following pandas dataframes, df1 and df2.
import pandas as pd
df1 = pd.DataFrame([['a', 'b'], ['c', 'd'], ['e', 'f']], columns=['First', 'Last'])
df2 = pd.DataFrame([['a', 'b'], ['e', 'f'], ['g', 'h']], columns=['First', 'Last'])
In this case, the row at index 0 of df1 is [a,b], the row at index 1 of df1 is [c,d], and so on.
I want to know the most efficient way to find the row indices at which these two DataFrames differ.
In particular, although [e,f] appears in both DataFrames, it is at index 2 in df1 and at index 1 in df2, so I want the result to report both of those indices as differing.
The result should be something like diff(df1, df2) = [1, 2].
I know I could loop over all the rows and check whether (df1.loc[i, :] == df2.loc[i, :]).all() holds for each i in range(len(df1)), but is there a more efficient way?
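For concreteness, here is a minimal sketch of the row-by-row loop described above, next to one possible vectorized alternative built on DataFrame.ne and any(axis=1). It assumes both frames have the same shape, the same column names, and the same default integer index. The function names diff_loop and diff_vectorized are hypothetical, chosen only for illustration, and the vectorized version has not been benchmarked here.

```python
import pandas as pd

df1 = pd.DataFrame([['a', 'b'], ['c', 'd'], ['e', 'f']], columns=['First', 'Last'])
df2 = pd.DataFrame([['a', 'b'], ['e', 'f'], ['g', 'h']], columns=['First', 'Last'])


def diff_loop(left, right):
    # Baseline: compare one row at a time.
    # Assumes left and right share the same index and columns.
    return [i for i in left.index
            if not (left.loc[i, :] == right.loc[i, :]).all()]


def diff_vectorized(left, right):
    # Element-wise "not equal" over the whole frame, then flag every row
    # in which at least one column differs.
    # Caveat: NaN never compares equal to NaN, so a row holding NaN in the
    # same cell of both frames is reported as different.
    mask = left.ne(right).any(axis=1)
    return left.index[mask.to_numpy()].tolist()


print(diff_loop(df1, df2))        # [1, 2]
print(diff_vectorized(df1, df2))  # [1, 2]
```

Both functions compare rows by index label, so [e,f] sitting at index 2 in df1 and index 1 in df2 counts as a difference at both positions, which matches the outcome asked for.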