
I have two dataframes that I want to compare, but I only want to work with the rows that are not present in both dataframes.

Example:

DF1:

     A    B    C
0    1    2    3
1    4    5    6

DF2:

     A    B    C
0    1    2    3
1    4    5    6
2    7    8    9
3    10   11   12

So, from this example, I want to work with the rows at index 2 and 3 of DF2 ([7, 8, 9] and [10, 11, 12]).

The code I currently have (it only removes duplicates) is below:

import pandas as pd

# Stack both frames, then keep only the rows that occur exactly once
df = pd.concat([di_old, di_new])
df = df.reset_index(drop=True)
df_gpby = df.groupby(list(df.columns))
idx = [x[0] for x in df_gpby.groups.values() if len(x) == 1]
print(df.reindex(idx))
ben
  • Does this answer your question? [pandas get rows which are NOT in other dataframe](https://stackoverflow.com/questions/28901683/pandas-get-rows-which-are-not-in-other-dataframe) – saiden Dec 09 '21 at 13:46
  • Thank you @saiden. This worked. I had some issues with my dataframe containing a date, but excluded it and it worked. – ben Dec 13 '21 at 09:07

1 Answer

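I would do:

df_n = df2[~df2.isin(df1).all(axis=1)]

Output:

    A   B   C
2   7   8   9
3  10  11  12

Note that isin with a DataFrame argument compares values at matching index and column labels, so this only works when the shared rows sit at the same index in both dataframes.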
Tbaki
  • Thanks for your response. When attempting this, it returned an empty dataframe. Found the solution in the link that was shared by @saiden – ben Dec 13 '21 at 09:09
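
For readers who land here without following the link, one common way to get the rows of DF2 that are not in DF1 is a left merge with an indicator column. The sketch below is an illustration under assumptions, not necessarily the exact code the asker ended up using: the frames are rebuilt from the example in the question, and the names `merged` and `only_in_df2` are made up for this example.

```python
import pandas as pd

# Example frames rebuilt from the question (DF1 and DF2).
df1 = pd.DataFrame({"A": [1, 4], "B": [2, 5], "C": [3, 6]})
df2 = pd.DataFrame({"A": [1, 4, 7, 10], "B": [2, 5, 8, 11], "C": [3, 6, 9, 12]})

# Left-merge df2 against the distinct rows of df1 on all shared columns.
# indicator=True adds a "_merge" column marking each row as
# "both" (also present in df1) or "left_only" (present only in df2).
merged = df2.merge(df1.drop_duplicates(), how="left", indicator=True)

# Keep the rows found only in df2, then drop the helper column.
only_in_df2 = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")

print(only_in_df2)
```

Unlike `isin`, the merge compares row values only and ignores the index, so it does not depend on matching rows sitting at the same position in both frames. If one column should not take part in the comparison (for example a date column), the columns to compare can be listed with the `on` parameter of `merge`.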