I would like to compute the mean per ID using groupby and mean. However, I only need the rows where Date is between 2016-01-01 and 2017-12-31.
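import pandas as pd

# Sample data: one ID with three amounts on different dates
d = {'ID': ['STCK123', 'STCK123', 'STCK123'], 'Amount': [250, 400, 350],
     'Date': ['2016-01-20', '2017-09-25', '2018-05-15']}
data = pd.DataFrame(data=d)
data = data[['ID', 'Amount', 'Date']]
# Convert the Date strings to datetime64 so they can be compared as dates
data['Date'] = pd.to_datetime(data['Date'])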
This gives the following DataFrame:
ID Amount Date
STCK123 250 2016-01-20
STCK123 400 2017-09-25
STCK123 350 2018-05-15
When I use:
data.groupby(['ID'])['Amount'].agg('mean')
It takes all rows into account, resulting in a mean value of 333.3. How can I exclude the rows where Date is in 2018 (yielding a mean value of (250+400)/2 = 325)?
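For reference, here is a minimal sketch of the result I am after, assuming both boundary dates should be included. It builds a boolean mask with Series.between and filters the rows before grouping. The names in_range and mean_per_id are only illustrative, and there may well be a more idiomatic approach.

```python
import pandas as pd

data = pd.DataFrame({
    'ID': ['STCK123', 'STCK123', 'STCK123'],
    'Amount': [250, 400, 350],
    'Date': pd.to_datetime(['2016-01-20', '2017-09-25', '2018-05-15']),
})

# Assumed bounds taken from the question; both ends are treated as inclusive.
start = pd.Timestamp('2016-01-01')
end = pd.Timestamp('2017-12-31')

# Boolean mask that is True only for rows whose Date lies within [start, end].
in_range = data['Date'].between(start, end)

# Filter first, then group, so the 2018 row never reaches the aggregation.
mean_per_id = data.loc[in_range].groupby('ID')['Amount'].mean()
print(mean_per_id)
```

Note that the upper bound here is midnight on 2017-12-31, so if Date carried a time component, rows later on that day would be dropped. In that case comparing with Date < 2018-01-01 would probably be the safer condition.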