0

I have two dataframes. I want to update one dataframe with another with DATE and other columns as index. If the rows are not matching I want to insert that row with update function. How can I update/insert with update function ?

Here is my Ex: data

a = {'DATE': ['2022-01-01', '2022-01-02'],
          'CATEGORY': ['a', 'a'],
          'QUANTITY': [12, 14]}

b = {'DATE': ['2022-01-02', '2022-01-03', '2022-01-04'],
         'CATEGORY': ['a', 'a', 'a'],
         'QUANTITY': [0, 3, 9]}
a_df = pd.DataFrame(a)
b_df = pd.DataFrame(b)
a_df

DATE    CATEGORY    QUANTITY
0   2022-01-01  a   12
1   2022-01-02  a   14


b_df

DATE    CATEGORY    QUANTITY
0   2022-01-02  a   0
1   2022-01-03  a   3
2   2022-01-04  a   9

Result I am getting

b_df

DATE    CATEGORY    QUANTITY
0   2022-01-02  a   14.0
1   2022-01-03  a   3.0
2   2022-01-04  a   9.0

Opration Iam doing for the update is,

b_df.set_index(["DATE","CATEGORY"], inplace=True)
b_df.update(a_df.set_index(["DATE","CATEGORY"]))
b_df.reset_index(inplace=True)

Result I am getting

b_df

DATE    CATEGORY    QUANTITY
0   2022-01-02  a   14.0
1   2022-01-03  a   3.0
2   2022-01-04  a   9.0

I do want the first record from a_df in b_df dataframe after update. So the expected result is b_df

DATE    CATEGORY    QUANTITY
0   2022-01-01  a   12.0
1   2022-01-02  a   14.0
2   2022-01-03  a   3.0
3   2022-01-04  a   9.0

If the row in a_df doest exist in the b_df , I should insert that in the b_df, If exists I should update it :)

Ranjith
  • 3
  • 3

1 Answers1

0

If original both DataFrames has no duplicated columns DATE/CATEGORY is possible use:

df = pd.concat([a_df, b_df]).drop_duplicates(['DATE','CATEGORY'])
print (df)
         DATE CATEGORY  QUANTITY
0  2022-01-01        a        12
1  2022-01-02        a        14
1  2022-01-03        a         3
2  2022-01-04        a         9
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252