I have two DataFrames:

prev_df:

       Time       FO_SYMBOL  TOTAL_VOLUME
0  14:20:41             ACC        6778.0
1  14:56:57        ADANIENT        4314.0
2  09:19:12      AUROPHARMA        1295.0
3  15:09:14      BAJAJ-AUTO        8339.0
4  09:19:12         HCLTECH        1431.0
5  09:19:12      HEROMOTOCO        1551.0
6  13:53:02      ULTRACEMCO        8284.0

df:

       Time       FO_SYMBOL  TOTAL_VOLUME
0  14:20:41             ACC        6778.0
1  14:56:57        ADANIENT        4314.0
2  09:19:12      AUROPHARMA        1295.0
3  15:09:14      BAJAJ-AUTO        8339.0
4  09:19:12         HCLTECH        1431.0
5  09:19:12      HEROMOTOCO        1551.0
6  13:53:02      ULTRACEMCO        8284.0
7  14:55:12            BHEL        8114.0  <<= NEW ROW
8  14:55:12            BHEL        8120.0  <<= NEW ROW

I want to compare the two DataFrames and find the rows in df that are not present in prev_df. I want my output as below:

Result:

0  14:55:12      BHEL              8114.0 <<= NEW ROW
1  14:55:12      BHEL              8120.0 <<= NEW ROW

Currently I am using the code below:

indexes = (df != prev_df).any(axis=1)
new_df = df.loc[indexes]

But when new rows are added to df, I get this error:

Can only compare identically-labeled DataFrame objects

Please help.

2 Answers

2

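You can concat and drop_duplicates. Stacking prev_df twice guarantees that each of its rows occurs more than once, so keep=False removes them all and only the rows unique to df remain:

# compare on the columns the two frames share
cols = prev_df.columns.intersection(df.columns).tolist()
pd.concat([df, pd.concat([prev_df]*2)]).drop_duplicates(subset=cols, keep=False)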

       Time FO_SYMBOL  TOTAL_VOLUME
7  14:55:12      BHEL        8114.0
8  14:55:12      BHEL        8120.0
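
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The two small frames are a cut-down stand-in for the question's data, and the names stacked and new_rows are only illustrative. One caveat that follows from keep=False: if df itself contained the same new row twice, both copies would be dropped as well.

```python
import pandas as pd

# Cut-down stand-ins for the question's frames (assumed sample data).
prev_df = pd.DataFrame({
    "Time": ["14:20:41", "14:56:57"],
    "FO_SYMBOL": ["ACC", "ADANIENT"],
    "TOTAL_VOLUME": [6778.0, 4314.0],
})
df = pd.DataFrame({
    "Time": ["14:20:41", "14:56:57", "14:55:12", "14:55:12"],
    "FO_SYMBOL": ["ACC", "ADANIENT", "BHEL", "BHEL"],
    "TOTAL_VOLUME": [6778.0, 4314.0, 8114.0, 8120.0],
})

# Compare on the columns both frames share.
cols = prev_df.columns.intersection(df.columns).tolist()

# prev_df is stacked twice so each of its rows appears at least twice,
# even a row that is missing from df.
stacked = pd.concat([df, prev_df, prev_df])

# keep=False drops every row that occurs more than once,
# leaving only the rows that exist in df alone.
new_rows = stacked.drop_duplicates(subset=cols, keep=False)
print(new_rows)
```

The surviving rows keep their original index labels from df; call reset_index(drop=True) on the result if you want them renumbered from 0 as in the question's expected output.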
anky
1

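Try a left merge with indicator=True and keep only the rows marked left_only. Leaving out the on argument makes the merge use all columns the two frames share:

df3 = pd.merge(df, prev_df, how='left', indicator=True)
# rows found only in df are tagged 'left_only'; keep them and drop the helper column
df3 = df3[df3['_merge'] == 'left_only'].drop(columns='_merge')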

      Time FO_SYMBOL  TOTAL_VOLUME
7  14:55:12      BHEL        8114.0
8  14:55:12      BHEL        8120.0
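
A minimal runnable sketch of the merge-with-indicator approach, assuming the comparison should use every column the two frames have in common (so no on argument is passed). The sample frames are a cut-down stand-in for the question's data, and merged and only_in_df are illustrative names.

```python
import pandas as pd

# Cut-down stand-ins for the question's frames (assumed sample data).
prev_df = pd.DataFrame({
    "Time": ["14:20:41", "14:56:57"],
    "FO_SYMBOL": ["ACC", "ADANIENT"],
    "TOTAL_VOLUME": [6778.0, 4314.0],
})
df = pd.DataFrame({
    "Time": ["14:20:41", "14:56:57", "14:55:12", "14:55:12"],
    "FO_SYMBOL": ["ACC", "ADANIENT", "BHEL", "BHEL"],
    "TOTAL_VOLUME": [6778.0, 4314.0, 8114.0, 8120.0],
})

# Left merge on all shared columns; indicator=True adds a "_merge" column
# with the value "both" or "left_only" for each row of df.
merged = df.merge(prev_df, how="left", indicator=True)

# Keep the rows that had no match in prev_df, then drop the helper column.
only_in_df = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")
print(only_in_df)
```

Unlike the drop_duplicates approach, this keeps a new row even if it appears more than once in df, because each row is only checked against prev_df.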
tawab_shakeel