How do I refer to extracted start and end dates in order to sample a timeseries with date_range?
I currently get the error "'DataFrame' object has no attribute 'date_range'". (I would be happy with any solution that does not raise an error when the sampling start and end dates differ between SNs, as they typically do.)
In:
import pandas as pd
import numpy as np
df = pd.read_csv("csv.csv", sep=';', error_bad_lines=False)
df['TimeDate'] = pd.to_datetime(df['TimeDate'],dayfirst=True)
gdf = df.groupby(['SN'],as_index = False)
dates = pd.DataFrame(gdf.nth(0))  # pd.DataFrame used here to avoid a SettingWithCopyWarning.
dates['StartD'] = gdf.nth(0)['TimeDate'].array
dates['EndD'] = gdf.nth(-1)['TimeDate'].array
# A little odd: the join fills StartD/EndD only on the first row of each group.
df = df.join(dates.reindex(columns=['StartD', 'EndD']))
# Group by 'SN' again, then try to sample each group.
gdf = df.groupby(['SN'],as_index = False)
for k, gp in gdf:
    print('key=' + str(k))
    print(gp.date_range(start=gdf['StartD'], end=gdf['EndD'], freq='D'))
Out:
           SN             TimeDate  ...               StartD                 EndD
0    sn000066  2016-09-28 00:01:00  ...  2016-09-28 00:01:00  2016-10-03 23:59:00
1    sn000066  2016-09-29 00:01:00  ...
...
n    sn000066  2016-10-02 00:01:00  ...
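
For reference, here is a minimal sketch of what I am trying to achieve, not a confirmed solution. It relies on the fact that date_range is a top-level pandas function (pd.date_range) rather than a DataFrame method, and it takes the start and end of each SN from the first and last TimeDate in that group. The sample frame, the second serial number (sn000067) and the ranges dict are made up for illustration and only stand in for csv.csv; the sketch also assumes the rows are already in chronological order within each SN.

```python
import pandas as pd

# Hypothetical sample data standing in for csv.csv (only SN and TimeDate).
df = pd.DataFrame({
    "SN": ["sn000066", "sn000066", "sn000066", "sn000067", "sn000067"],
    "TimeDate": pd.to_datetime([
        "2016-09-28 00:01:00", "2016-09-29 00:01:00", "2016-10-03 23:59:00",
        "2016-10-01 12:00:00", "2016-10-02 12:00:00",
    ]),
})

# Broadcast each group's first and last timestamp to every row of that group,
# instead of joining values that only line up with the first row.
grouped = df.groupby("SN")["TimeDate"]
df["StartD"] = grouped.transform("first")
df["EndD"] = grouped.transform("last")

# Build one daily range per SN with the top-level pd.date_range function.
ranges = {}
for sn, gp in df.groupby("SN"):
    start = gp["TimeDate"].iloc[0]   # first timestamp of this SN
    end = gp["TimeDate"].iloc[-1]    # last timestamp of this SN
    ranges[sn] = pd.date_range(start=start, end=end, freq="D")
    print("key=" + str(sn))
    print(ranges[sn])
```

Because start and end are scalars taken from the group itself, each SN gets its own range even when the dates differ between SNs. If the rows are not sorted by TimeDate, "min" and "max" could be used in place of "first" and "last".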