2

I have a data frame as follows.

pd.DataFrame({'Date':['2020-08-01','2020-08-01','2020-09-01'],'value':[10,12,9],'item':['a','d','b']})

I want to convert this to weekly data keeping all the columns apart from the Date column constant.

Expected output

pd.DataFrame({'Date':['2020-08-01','2020-08-08','2020-08-15','2020-08-22','2020-08-29','2020-08-01','2020-08-08','2020-08-15','2020-08-22','2020-08-29','2020-09-01','2020-09-08','2020-09-15','2020-09-22','2020-09-29'],
          'value':[10,10,10,10,10,12,12,12,12,12,9,9,9,9,9],'item':['a','a','a','a','a','d','d','d','d','d','b','b','b','b','b']})

It should be able to convert any month data to weekly data. Date in the input data frame is always the first day of that month. How do I make this happen? Thanks in advance.

NAS_2339
  • 353
  • 2
  • 13
  • Does this answer your question? [How can I resample (upsample) my Pandas Dataframe?](https://stackoverflow.com/questions/53265779/how-can-i-resample-upsample-my-pandas-dataframe) – RichieV Nov 03 '20 at 14:42
  • to go from monthly to weekly rows you would have to choose something like `7d` period for resampling, if that specific answer doesn't fix your case just search around for _"pandas resample"_ – RichieV Nov 03 '20 at 14:43

2 Answers2

2

Since the desired new datetime index is irregular (re-starts at the 1st of each month), an iterative creation of the index is an option:

df = pd.DataFrame({'Date':['2020-08-01','2020-09-01'],'value':[10,9],'item':['a','b']})

df = df.set_index(pd.to_datetime(df['Date'])).drop(columns='Date')

dti = pd.to_datetime([]) # start with an empty datetime index
for month in df.index: # for each month, add a 7-day step datetime index to the previous
    dti = dti.union(pd.date_range(month, month+pd.DateOffset(months=1), freq='7d'))

# just reindex and forward-fill, no resampling needed
df = df.reindex(dti).ffill()
df
            value item
2020-08-01   10.0    a
2020-08-08   10.0    a
2020-08-15   10.0    a
2020-08-22   10.0    a
2020-08-29   10.0    a
2020-09-01    9.0    b
2020-09-08    9.0    b
2020-09-15    9.0    b
2020-09-22    9.0    b
2020-09-29    9.0    b
FObersteiner
  • 22,500
  • 8
  • 42
  • 72
  • I have edited the question with the expected output with an extra row of input. My expectation is a little bit different here. I want to resample the monthly data to weekly data.@MrFuppes – NAS_2339 Nov 03 '20 at 17:51
  • But I'm not able to get the required output with the pandas.resample method. – NAS_2339 Nov 03 '20 at 18:02
  • @Naseef - made an edit, let me know if that works for you. – FObersteiner Nov 03 '20 at 19:04
  • Hey, I have edited the question. I'm getting 'ValueError: cannot reindex from a duplicate axis' in this case. I guess it is because we have two items with the same Date. How do I tackle this? – NAS_2339 Nov 04 '20 at 05:26
  • @Naseef: that is painful ^^ - of course if you have duplicate entries in the date column, you can't use the indexing method I describe since it requires unique datetime values. You'd have to add a time or something, not sure if resampling would work either. – FObersteiner Nov 04 '20 at 06:53
1

I added one more date to your data and then used resample:

df = pd.DataFrame({'Date':['2020-08-01', '2020-09-01'],'value':[10, 9],'item':['a', 'b']})

df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)
df = df.resample('W').ffill().reset_index()
print(df)


        Date  value item
0 2020-08-02     10    a
1 2020-08-09     10    a
2 2020-08-16     10    a
3 2020-08-23     10    a
4 2020-08-30     10    a
5 2020-09-06      9    b
NYC Coder
  • 7,424
  • 2
  • 11
  • 24