First, I want to forward fill my data for EACH UNIQUE VALUE in Group_Id by 1S, so basically grouping by Group_Id and then resampling using ffill.
Here is the data:
    Id                Timestamp   Data  Group_Id
0    1  2018-01-01 00:00:05.523  125.5       101
1    2  2018-01-01 00:00:05.757  125.0       101
2    3  2018-01-02 00:00:09.507  127.0        52
3    4  2018-01-02 00:00:13.743  126.5        52
4    5  2018-01-03 00:00:15.407  125.5        50
...
11  11  2018-01-01 00:00:07.523  125.5       120
12  12  2018-01-01 00:00:08.757  125.0       120
13  13  2018-01-04 00:00:14.507  127.0       300
14  14  2018-01-04 00:00:15.743  126.5       300
15  15  2018-01-05 00:00:19.407  125.5       350
I previously did this:
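import pandas as pd

def daily_average_temperature(dfdf):
    # Keep only the needed columns; copy to avoid modifying a slice of dfdf
    INDEX = dfdf[['Group_Id', 'Timestamp', 'Data']].copy()
    INDEX['Timestamp'] = pd.to_datetime(INDEX['Timestamp'])
    INDEX = INDEX.set_index('Timestamp')
    # Resample the whole frame into 1-second bins, then forward fill the gaps
    INDEX1 = INDEX.resample('1S').last().ffill()
    return INDEX1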
This is wrong because it does not group the data by Group_Id first; it resamples all rows together and effectively ignores that column.
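To make the goal concrete, here is a minimal sketch of the grouped version I am after, assuming that chaining groupby with resample is a reasonable way to do it. The helper name resample_per_group and the four-row sample frame are made up for illustration (the values are copied from the first rows above), and I use the lowercase '1s' alias because newer pandas versions warn about '1S'.

```python
import pandas as pd

# Small sample with the same columns as the data above (first four rows only).
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "Timestamp": [
        "2018-01-01 00:00:05.523", "2018-01-01 00:00:05.757",
        "2018-01-02 00:00:09.507", "2018-01-02 00:00:13.743",
    ],
    "Data": [125.5, 125.0, 127.0, 126.5],
    "Group_Id": [101, 101, 52, 52],
})


def resample_per_group(frame, freq="1s"):
    """Resample Data into `freq` bins separately for each Group_Id, then forward fill."""
    out = frame[["Group_Id", "Timestamp", "Data"]].copy()
    out["Timestamp"] = pd.to_datetime(out["Timestamp"])
    return (
        out.set_index("Timestamp")
        .groupby("Group_Id")["Data"]   # one resampler per group
        .resample(freq)
        .last()                        # last observation in each bin
        .groupby(level="Group_Id")
        .ffill()                       # fill empty bins within each group only
        .reset_index()
    )


resampled = resample_per_group(df)
print(resampled)
```

With this sample, each group only spans its own first-to-last timestamp, so group 52 gets one row per second from 00:00:09 to 00:00:13 and group 101 collapses to a single 00:00:05 bin.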
Second, I would like to spread the Data values so that each row is one Group_Id, with positional index columns (x0, x1, x2, ...) replacing Timestamp. It should look something like this:
     x0     x1    x2    x3    x4    x5  ...  Group_Id
0    40  31.05  25.5  25.5  25.5    25  ...         1
1    35  35.75  36.5  36.5  36.5  36.5  ...         2
2  25.5   25.5  25.5  25.5  25.5  25.5  ...         3
3  25.5   25.5  25.5  25.5  25.5  25.5  ...         4
4    25     25    25    25    25    25  ...         5
⋮     ⋮      ⋮     ⋮     ⋮     ⋮     ⋮    ⋮         ⋮
Please note that this table above is not related to the previous dataset but just used to show the format.
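A rough sketch of the reshaping step, assuming the data is already in long format with one row per Group_Id and second. The frame long_df, the helper to_wide, and the intermediate column name step are all hypothetical and only there to show the idea: number the rows within each group, then pivot on that number.

```python
import pandas as pd

# Hypothetical long-format input: one row per (Group_Id, second).
long_df = pd.DataFrame({
    "Group_Id": [52, 52, 52, 101, 101],
    "Timestamp": pd.to_datetime([
        "2018-01-02 00:00:09", "2018-01-02 00:00:10", "2018-01-02 00:00:11",
        "2018-01-01 00:00:05", "2018-01-01 00:00:06",
    ]),
    "Data": [127.0, 127.0, 126.5, 125.5, 125.0],
})


def to_wide(frame):
    """Turn each group's time-ordered Data values into columns x0, x1, ..."""
    ordered = frame.sort_values(["Group_Id", "Timestamp"])
    # Position of each row within its group: 0, 1, 2, ...
    ordered = ordered.assign(step=ordered.groupby("Group_Id").cumcount())
    wide = ordered.pivot(index="Group_Id", columns="step", values="Data")
    wide.columns = [f"x{i}" for i in wide.columns]
    wide = wide.reset_index()
    # Move Group_Id to the end to match the layout shown above.
    value_cols = [c for c in wide.columns if c != "Group_Id"]
    return wide[value_cols + ["Group_Id"]]


wide = to_wide(long_df)
print(wide)
```

Groups with fewer rows than the longest group end up with NaN in the trailing columns, so they would need padding or truncating if every row must have the same number of values.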
Thanks