0

I have the following (simplified) data;

import pandas as pd

a = [['10', '12345', '4'], ['15', '78910', '3'], ['8', '23456', '10']]
b = [['10', '12345'], ['15', '78910'], ['9', '23456']]

df_a = pd.DataFrame(a, columns=['id', 'sku', 'quantity '])
df_b = pd.DataFrame(b, columns =['id','sku'])

I need to compare the 'id and 'sku' columns from both dataframes and for those that match I need to update df_a['quantity']to equal '0'.

So, something like an if statement?

if (df_a['id'] == df_b['id']) and (df_a['sku'] == df_b['sku']):
    df_a['quantity']=0
FunnyChef
  • 1,880
  • 3
  • 18
  • 30

3 Answers3

3

This should do it

df_a.loc[(df_b['id'] == df_a['id']) & (df_a['sku'] == df_b['sku']), 'quantity '] = 0
Sachit Nagpal
  • 486
  • 4
  • 7
  • That would do it except my dataframes are different shapes. Which throws: "ValueError: Can only compare identically-labeled Series objects" So, how can this be done with dataframes that have different shapes? – FunnyChef Oct 24 '17 at 18:40
2

Not the most elegant way, but will do the trick if dataframes have different shapes.

a_id_sku = df_a.id + df_a.sku
b_id_sku = df_b.id + df_b.sku

df_a.loc[a_id_sku.isin(b_id_sku), 'quantity '] = 0

Let me know if this worked

Sachit Nagpal
  • 486
  • 4
  • 7
2

Another approach using pandas merge:

df_a.loc[pd.merge(df_a, df_b, on = ['id', 'sku'] , how='left',
    indicator=True)['_merge'] == 'both', 'quantity'] = 0

df_a
    id  sku quantity
0   10  12345   0
1   15  78910   0
2   8   23456   10
Quickbeam2k1
  • 5,287
  • 2
  • 26
  • 42