0

I've tried various group by methods I would like to add a new column 'product-locations' which calculates the total number of 'store locations' for a specific 'product code' on a given date. Basically, how many total number stores is a specific product selling in on any given day. My dataframe should look like this, with 'store-locations' added as a new column.

date store_location product_code store_locations
2017-01-01 Store-A 100 3
2017-01-01 Store-B 100 3
2017-01-01 Store-C 100 3
2017-01-01 Store-D 200 1
2017-01-02 Store-D 200 1

The following for example ignores grouping by date and only takes into account the number if unique products:

group = df.groupby(['date','store_location','product_code']).size().groupby(level=2).size()

Anentropic
  • 32,188
  • 12
  • 99
  • 147
jayboog321
  • 35
  • 4

1 Answers1

1

you can use:

pvt_coef = df.pivot_table(index=['date','product_code'], aggfunc={'store_location': np.count_nonzero})
pvt_coef.rename(columns={'store_location':'count'}, inplace=True)
pvt_coef.reset_index()
dfcoef = pd.merge(df, pvt_coef, left_on=['date','product_code'], right_on = ['date','product_code'], how='left')