I've got a DataFrame that looks like this:

It has two columns, one of them being a "from" datetime and one of them being a "to" datetime. I would like to change this DataFrame such that it has a single column or index for the date (e.g. 2015-07-06 00:00:00 in datetime form) with the variables of the other columns (like deep) split proportionately into each of the days. How might one approach this problem? I've meddled with groupby tricks and I'm not sure how to proceed.

BlandCorporation
  • Please don't post images of your data. Make it easy for us to cut and paste to recreate your issue. [How to create good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). Also, can you be more explicit about what your desired output is and how you get it? – pault Feb 23 '18 at 21:38

1 Answer

I don't have time to work through your specific problem at the moment, but the way to approach this is to use pandas' DataFrame.resample(). Here are the steps I would take: 1) Resample on your "to" date column by minute. 2) Populate the other columns out over that resample. 3) Add the date column back in as the index.

If this doesn't work or turns out to be tricky, I would create a date range from your earliest date to your latest date (at the smallest interval you want, so maybe hourly?) and then run some conditional statements over your other columns to fill in the data.
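As a rough sketch of that date-range idea, the snippet below splits each interval across the calendar days it overlaps, in proportion to the time spent in each day. The sample data and the column names "from", "to" and "deep" are assumptions based on the question's description, and it assumes every "to" is strictly later than its "from"; adapt it to your real columns.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumed from the question.
df = pd.DataFrame({
    "from": pd.to_datetime(["2015-07-06 22:00", "2015-07-07 08:00"]),
    "to": pd.to_datetime(["2015-07-07 06:00", "2015-07-07 12:00"]),
    "deep": [80.0, 40.0],
})

records = []
for start, end, deep in zip(df["from"], df["to"], df["deep"]):
    total = (end - start).total_seconds()  # assumes end > start
    # One entry per calendar day touched by the interval.
    for day in pd.date_range(start.normalize(), end.normalize(), freq="D"):
        # Clip the interval to this day: [day, day + 1 day).
        lo = max(start, day)
        hi = min(end, day + pd.Timedelta(days=1))
        share = (hi - lo).total_seconds() / total
        if share > 0:
            records.append({"date": day, "deep": deep * share})

# Sum the pieces that land on the same day.
daily = pd.DataFrame(records).groupby("date")["deep"].sum()
print(daily)
```

The per-day values add up to the original column total, so nothing is lost or double counted. For several value columns you would multiply each of them by the same share.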

Here is roughly what your code might look like for the resample portion (replace day with hour or whatever):

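  import pandas as pd
  # Date range for the fallback approach; adjust the endpoints and freq to your data
  drange = pd.date_range('01-01-1970', '01-20-2018', freq='D')
  # resample() needs a DatetimeIndex on data; .ffill() is the current spelling of fillna(method='ffill')
  data = data.resample('D').ffill()
  data.index.name = 'date'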
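To see what the resample step does on its own, here is a small toy example. The frame and its "deep" column are made up for illustration. Note that forward-filling copies each value onto the new rows rather than dividing it between them, so a proportional split needs an extra step on top of this.

```python
import pandas as pd

# Made-up toy frame indexed by date, with a gap between the two rows.
toy = pd.DataFrame(
    {"deep": [10.0, 40.0]},
    index=pd.to_datetime(["2015-07-06", "2015-07-09"]),
)

# Upsample to daily rows and carry the last known value forward.
toy_daily = toy.resample("D").ffill()
toy_daily.index.name = "date"
print(toy_daily)
```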

Hope this helps!

Stephen Strosko