Create a new column with the number of unique occurances by date (groupby 2 conditions - in a timeseries)

Question

I've tried various group by methods I would like to add a new column 'product-locations' which calculates the total number of 'store locations' for a specific 'product code' on a given date. Basically, how many total number stores is a specific product selling in on any given day. My dataframe should look like this, with 'store-locations' added as a new column.

date	store_location	product_code	store_locations
2017-01-01	Store-A	100	3
2017-01-01	Store-B	100	3
2017-01-01	Store-C	100	3
2017-01-01	Store-D	200	1
2017-01-02	Store-D	200	1

The following for example ignores grouping by date and only takes into account the number if unique products:

group = df.groupby(['date','store_location','product_code']).size().groupby(level=2).size()

score 1 · Answer 1 · answered Feb 17 '23 at 12:12

you can use:

pvt_coef = df.pivot_table(index=['date','product_code'], aggfunc={'store_location': np.count_nonzero})
pvt_coef.rename(columns={'store_location':'count'}, inplace=True)
pvt_coef.reset_index()
dfcoef = pd.merge(df, pvt_coef, left_on=['date','product_code'], right_on = ['date','product_code'], how='left')

Create a new column with the number of unique occurances by date (groupby 2 conditions - in a timeseries)

1 Answers1