I have a dataframe like below:
Date Group Value Duration
2018-01-01 A 20 30
2018-02-01 A 10 60
2018-01-01 B 15 180
2018-02-01 B 30 210
2018-03-01 B 25 238
2018-01-01 C 10 235
In the above df
, we see that for group A last date is 2018-02-01
. And I want to insert a row with Date
as 2018-03-01
and duplicate the values of the previous row. So in other words, for each group, I want equal number of row count by inserting a duplicating row till the Date
reaches 2018-03-01
.
My resultant Dataframe would look like
Date Group Value Duration
2018-01-01 A 20 30
2018-02-01 A 10 60
2018-03-01 A 10 60 <----New Row
2018-01-01 B 15 180
2018-02-01 B 30 210
2018-03-01 B 25 238
2018-01-01 C 10 235
2018-02-01 C 10 235 <----New Row
2018-03-01 C 10 235 <----New Row
N.B.: We can have 0
values for new row instead of duplication from last existing row.
I am trying to achieve the above by the following codes but not able to progress further:
max_d = df.loc[df.groupby(['Group'])['Date'].idxmax()]
for k,v in df.groupby(['Group']):
if k['Date'].max() < max_d['Date'].max():
last_row = 0 <----or k.iloc[-1]
last_row['Date'] = pd.DateOffset(month=1)
k.append(last_row)
But I am getting error as:
Traceback (most recent call last):
File "<ipython-input-6-44d0a094516e>", line 2, in <module>
if k['Date'].max() < max_d['Date'].max():
TypeError: 'int' object is not subscriptable
Am I missing out anything in the above code? Any clue will be appreciated.