I have a time series dataset which is basically consumption data of materials over the past 5 years
Material No Consumption Date Consumption
A 2019-06-01 1
A 2019-07-01 2
A 2019-08-01 3
A 2019-09-01 4
A 2019-10-01 0
A 2019-11-01 0
A 2019-12-01 0
A 2020-01-01 1
A 2020-02-01 2
A 2020-03-01 3
A 2020-04-01 0
A 2020-05-01 0
B 2019-06-01 0
B 2019-07-01 0
B 2019-08-01 0
B 2019-09-01 4
B 2019-10-01 0
B 2019-11-01 0
B 2019-12-01 0
B 2020-01-01 4
B 2020-02-01 2
B 2020-03-01 8
B 2020-04-01 0
B 2020-05-01 0
From the above dataframe, I want to see the number of months in which the material had at least 1 unit of consumption. The output dataframe should look something like this.
Material no_of_months(Jan2020-May2020) no_of_months(Jun2019-May2020)
A 3 7
B 3 4
Currently I'm sub-setting the data frame and using a group by to count the unique entries with non-zero consumption. However, this needs creating multiple data frames for different periods and then merging them. Was wondering if this could be done in a better way using dictionaries.
consumption_jan20_may20 = consumption.loc[consumption['Consumption Date']>='2020-01-01',['Material No','Consumption Date','Consumption']]
consumption_jan20_may20 = consumption_jan20_may20.groupby([pd.Grouper(key='Material No'),grouper])['Consumption'].count().reset_index()
consumption_jan20_may20 = consumption_jan20_may20.groupby('Material No').count().reset_index()
consumption_jan20_may20.columns = ['Material No','no_of_months(Jan2020-May2020)','dummy']
consumption_jan20_may20 = consumption_jan20_may20[['MATNR','no_of_months(Jan2020-May2020)']]