0

I have a time-series in pandas with several products (id's: a, b, etc), but with monthly holes. I have to fill those holes. It may be with np.nan or any other constant. I tried groupby but I wasnt able.

date          id   units
2022-01-01    a     10
2022-01-01    b     100
2022-02-01    a     15
2022-03-01    a     30
2022-03-01    b     70
2022-05-01    b     60
2022-06-01    a     8
2022-06-01    b     90

Should be:

date          id   units
2022-01-01    a     10
2022-01-01    b     100
2022-02-01    a     15
2022-02-01    b     np.nan
2022-03-01    a     30
2022-03-01    b     70
2022-04-01    a     np.nan
2022-04-01    b     np.nan
2022-05-01    a     np.nan
2022-05-01    b     60
2022-06-01    a     8
2022-06-01    b     90

2 Answers2

0

You can do pivot then stack

df = df.pivot(*df.columns).stack(dropna = False).reset_index(name = 'units')
Out[126]: 
         date id  units
0  2022-01-01  a   10.0
1  2022-01-01  b  100.0
2  2022-02-01  a   15.0
3  2022-02-01  b    NaN
4  2022-03-01  a   30.0
5  2022-03-01  b   70.0
6  2022-05-01  a    NaN
7  2022-05-01  b   60.0
8  2022-06-01  a    8.0
9  2022-06-01  b   90.0
BENY
  • 317,841
  • 20
  • 164
  • 234
0
df2=(df.set_index('date' )
     .groupby('id',  group_keys=False)
     .apply(lambda x: x.resample('1MS').asfreq(fill_value=np.nan)  )
     .reset_index() )

df2['id'].ffill(inplace=True)

df2
    date        id  units
0   2022-01-01  a   10.0
1   2022-02-01  a   15.0
2   2022-03-01  a   30.0
3   2022-04-01  a   NaN
4   2022-05-01  a   NaN
5   2022-06-01  a   8.0
6   2022-01-01  b   100.0
7   2022-02-01  b   NaN
8   2022-03-01  b   70.0
9   2022-04-01  b   NaN
10  2022-05-01  b   60.0
11  2022-06-01  b   90.0
Naveed
  • 11,495
  • 2
  • 14
  • 21