0

I am building a backtesting system for stock trading and have a dataframe of orders. Date_buy indicates when buying Name_buy. The switch indicates selling and buying new stocks.

Date_buy    Name_buy
0   2003-05-14  TLT
1   2004-01-14  TLT
2   2004-04-14  VO
3   2004-07-14  TLT
4   2005-01-03  VO
... ... ...
70  2019-11-01  VO
71  2020-02-03  TLT
72  2020-04-01  VO
73  2020-07-01  TLT
74  2020-09-01  VO

I want to fill the dates and keep value for each date:

Date_buy    Name_buy
0   2003-05-14  TLT
1   2003-05-15  TLT
2   2003-05-16  TLT
3   2003-05-17  TLT
4   2003-05-18  TLT
... ... ...
70  2019-11-01  VO
71  2019-11-02  VO
72  2019-11-03  VO
73  2019-11-04  VO
74  2019-11-05  VO

Obviously, in above example the index numbers would be much higher for dates in 2019 VO.

I tried following this solution but it didn't work. It returns the dates correct, but all other value as 0, caused by fill = 0.

idx = pd.period_range(min(df.date), max(df.date))
    ...: results.reindex(idx, fill_value=0)
doomdaam
  • 691
  • 1
  • 6
  • 21

1 Answers1

1

First convert your column Date_buy if this is not already done. Set it to index and use resample and pad to get your result.

# Convert Date_buy to datetime
df['Date_buy'] = pd.to_datetime(df['Date_buy'])

df = df.set_index('Date_buy').resample('D').pad().reset_index()
>>> df
       Date_buy Name_buy
0    2003-05-14      TLT
1    2003-05-15      TLT
2    2003-05-16      TLT
3    2003-05-17      TLT
4    2003-05-18      TLT
...         ...      ...
6316 2020-08-28      TLT
6317 2020-08-29      TLT
6318 2020-08-30      TLT
6319 2020-08-31      TLT
6320 2020-09-01       VO
Corralien
  • 109,409
  • 8
  • 28
  • 52