I have the following dataframe:
ID Datetime Measurement A B C
0 1000 2009-01-01 00:29:59 0.117 0 0 1
1 1000 2009-01-01 00:59:59 0.050 0 0 1
2 1000 2009-01-01 01:29:59 0.025 0 0 1
3 1000 2009-01-01 01:59:59 0.025 0 0 1
4 1000 2009-01-01 02:29:59 0.049 0 0 1
... ... ... ... ... ...
157992991 7444 2010-04-06 21:59:59 1.457 1 0 0
157992992 7444 2010-04-06 22:29:59 1.811 1 0 0
157992993 7444 2010-04-06 22:59:59 1.054 1 0 0
157992994 7444 2010-04-06 23:29:59 0.925 1 0 0
157992995 7444 2010-04-06 23:59:59 1.003 1 0 0
[157992996 rows x 6 columns]
The first column represents the ID
of a consumer. The second represents the Datetime
when the Measurement
of the third column was taken. This measurements are taken half-hourly every day. And the three last columns, A
, B
and C
is an one-hot-encode information regarding the type of consumer.
What i need here is to change the format of this dataframe, to one where in each row it should have 48 measurements (1 day of half-hour measurements) followed by the one-hot-encode information. The columns should be only the HH:MM:SS (the YY-MM-DD does not matter) and the one-hot-encode variables A
, B
and C
. For example, it should looke like this for the first three rows (that would be related to measurements from the ID 1000 in 2009-01-01
until 2009-01-03
) and the last row (related the measurements from the ID 7444 in 2010-04-06
):
00:29:59 00:59:59 01:29:59 01:29:59 ... 23:29:59 23:59:59 A B C
0 0.117 0.050 0.025 0.025 ... 0.280 0.133 0 0 1 <--- 2009-01-01 from ID 1000
1 0.237 0.251 0.028 0.172 ... 0.223 0.111 0 0 1 <--- 2009-01-02 from ID 1000
2 0.157 0.122 0.039 0.101 ... 0.321 0.139 0 0 1 <--- 2009-01-03 from ID 1000
... ... ... ... ... ... ... ...
3,291,520 0.433 0.541 0.829 0.901 ... 0.925 1.003 1 0 0 <--- 2010-04-06 from ID 7444
So far, i get the column names like this:
cols_name = df['Datetime'][0:48].dt.strftime('%H:%M:%S').values
Not really sure how can i proceed from here, any ideas?
EDIT: None of the examples in the suggesed answer solves my problem. It is a different kind of pivot that is not there.