How to transpose time-series by unique values in a column

Question

My data looks like this:

    Id           Timestamp         Data    Group_Id
0    1     2018-01-01 00:00:05.523 40      1
1    2     2018-01-01 00:00:10.757 31.5    1
2    3     2018-01-01 00:00:15.507 25.5    1
                   ...
3    4     2018-01-01 00:00:05.743 35      2
4    5     2018-01-01 00:00:10.407 35.75   2
                   ...

I want to group data by Group_Id so that each unique value in Group_Id is one row, while transposing the timestamps into a time series with each timestamp in a column like below:

            x0      x1      x2      x3      x4      x5      ...
1           40      31.05   25.5    25.5    25.5    25      ...
2           35      35.75   36.5    36.5    36.5    36.5    ...
⋮           ⋮       ⋮        ⋮       ⋮       ⋮        ⋮

Then I want to rename the timestamps according to time order, ie. the x0 replacing the first timestamp, x1 the second ...

I checked the df.pivot_table documentation and couldn't find a relevant function that sorts time series. And discussion on this link did not help me with this problem.

You were reading the correct solution. Check out question 10 in the discussion. In your case, `Data` is `B`, `Group_Id` is `A`. — Quang Hoang, Oct 14 '19 at 22:41
I checked question 10. Why do they have to insert 0 here? `df2.insert(0, 'count', df2.groupby('A').cumcount())` — nilsinelabore, Oct 14 '19 at 22:53
1. Insert syntax: `df.insert(location, name, values)`. You don't even need `insert`. You can make do with just `df['count'] = df.groupby('Group_Id').cumcount()`. 2. The question implicitly says you need to do `df.pivot(index='Group_Id', columns='count', values='Data')`. — Quang Hoang, Oct 14 '19 at 23:27
Use this: `new_df=df.copy() new_df['Timestamp']='x'+new_df.groupby('Group_Id')['Timestamp'].cumcount().astype(str) new_df.pivot_table(index='Group_Id',columns='Timestamp',values='Data')` — ansev, Oct 14 '19 at 23:34
Thanks @ansev. may I know what `new_df.columns.name=None` does here? I ran and compare the code it didn't seem to bring any change — nilsinelabore, Oct 14 '19 at 23:48
@ansev Checked my code again and still don't see any change:/ — nilsinelabore, Oct 14 '19 at 23:56
you should see that the name 'Timestamp' disappears at the top left of the dataframe — ansev, Oct 14 '19 at 23:58
@ansev Yeah that is what I wanted to do, but it refuses to disappear... — nilsinelabore, Oct 15 '19 at 00:01
@ansev And some values are `NaN`, any idea what could be the reason? — nilsinelabore, Oct 15 '19 at 00:03
@ansev I notice that pivoting distorts the order which gives `Timestamp x0 x1 x10 x100 x1000 x1001 x1002 x1003 x1004 x1005 x1006 x1007 x1008 x1009 x101 x1010 x1011 x1012 ...` for the column headings — nilsinelabore, Oct 15 '19 at 00:08
There are larger groups than others, it is normal for NaN... — ansev, Oct 15 '19 at 00:14
@ansev Yeah no worries it's sorted, but the order of indexing is still confusing... — nilsinelabore, Oct 15 '19 at 00:21
@nilsinelabore - Can you test `df['Timestamp']=df.groupby('Group_Id')['Timestamp'].cumcount() df = df.pivot(index='Group_Id',columns='Timestamp',values='Data').add_prefix('x')` ? — jezrael, Oct 15 '19 at 09:10

How to transpose time-series by unique values in a column

0 Answers0

Linked