Pivot table in Pandas to "unmelt" values in specified column into individual columns

Question

I have a Pandas data frame:

timestamp   device_id   metric_name     metric_value
2020-10-20  6C0301      throughput      5.0
2020-10-21  6C0301      throughput      6.3
2020-10-20  6C0301      cache           4.7
2020-10-21  6C0301      cache           2.1
2020-10-20  6C0302      throughput      1.4
2020-10-21  6C0302      throughput      1.8
2020-10-22  6C0302      blocks          9.3
2020-10-23  6C0302      blocks          7.2

So there are different devices at different times, each with its own set of metric_names and metric_values.

I need to produce a table that looks like this:

timestamp   device_id   throughput   cache   blocks   
2020-10-20  6C0301      5.0          4.7     NULL
2020-10-21  6C0301      6.3          2.1     NULL
2020-10-20  6C0302      1.4          NULL    NULL
2020-10-21  6C0302      1.8          NULL    NULL
2020-10-22  6C0302      NULL         NULL    9.3
2020-10-23  6C0302      NULL         NULL    7.2

So the values in the metric_name column get "unmelted" into their own individual columns.

I am looking at Pandas pivot_table, but it does not give me the result I want:

pd.pivot_table(df, values='metric_name', index=['timestamp', 'device_id'], columns=['metric_value'])

It tells me "DataError: No numeric types to aggregate" even though the metric_value column is float, and I am not looking to aggregate anything.

Use `df = pd.pivot_table(df, values='metric_value', index=['timestamp', 'device_id'], columns='metric_name')` — jezrael, Oct 22 '20 at 12:31
This is really close, although it produces multiple index levels, one of which has both an index name and a column name stacked on top of each other(?). Is there any way to have only a regular index? — Cybernetic, Oct 22 '20 at 12:40
All together `df = pd.pivot_table(df, values='metric_value', index=['timestamp', 'device_id'], columns='metric_name').reset_index().rename_axis(None, axis=1)` — jezrael, Oct 22 '20 at 12:42
Or use `df.set_index(["timestamp", "device_id", "metric_name"]).unstack("metric_name")`. — Henry Yik, Oct 22 '20 at 12:43
@jezrael Yes, that last one did it, except I also had to add `aggfunc='first'` — Cybernetic, Oct 22 '20 at 12:46
@HenryYik - yes, that is another way if no aggregation is needed, or `df = pd.pivot(df, values='metric_value', index=['timestamp', 'device_id'], columns='metric_name')` — jezrael, Oct 22 '20 at 12:46
I think `aggfunc='first'` can be used, but then data may be lost if there are duplicates. If that is not a problem, go ahead. — jezrael, Oct 22 '20 at 12:47
@jezrael It won't run without it. It shows error "no numeric types to aggregate" if no aggregation provided. See end of question. — Cybernetic, Oct 22 '20 at 12:48
Yes, there is a typo in the question: `values='metric_name'` and `columns=['metric_value']` are swapped and should be `values='metric_value'` and `columns='metric_name'` — jezrael, Oct 22 '20 at 12:49
@jezrael This is what works: `pd.pivot_table(df, values='metric_value', index=['timestamp', 'device_id'], columns='metric_name', aggfunc='first').reset_index().rename_axis(None, axis=1)` — Cybernetic, Oct 22 '20 at 12:50
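
For reference, here is a minimal runnable sketch that pulls the comments above together. It rebuilds the sample frame from the question and applies the `pivot_table` call that worked for the asker. It also applies a `set_index`/`unstack` variant of Henry Yik's suggestion, which selects the `metric_value` column first so the result has flat columns. The names `wide`, `alt`, and `tidy`, and the final column reordering and sorting, are assumptions added only to match the layout of the desired output; they are not part of the original comments.

```python
import pandas as pd

# Sample data copied from the question (timestamps kept as plain strings).
df = pd.DataFrame({
    "timestamp": ["2020-10-20", "2020-10-21", "2020-10-20", "2020-10-21",
                  "2020-10-20", "2020-10-21", "2020-10-22", "2020-10-23"],
    "device_id": ["6C0301"] * 4 + ["6C0302"] * 4,
    "metric_name": ["throughput", "throughput", "cache", "cache",
                    "throughput", "throughput", "blocks", "blocks"],
    "metric_value": [5.0, 6.3, 4.7, 2.1, 1.4, 1.8, 9.3, 7.2],
})


def tidy(frame):
    # Hypothetical helper: order columns and rows like the desired output.
    cols = ["timestamp", "device_id", "throughput", "cache", "blocks"]
    return (frame[cols]
            .sort_values(["device_id", "timestamp"])
            .reset_index(drop=True))


# Approach from the comments: the metric names become columns and the
# metric values fill the cells. aggfunc="first" keeps the first value per
# (timestamp, device_id, metric_name) group, so duplicates are silently dropped.
wide = tidy(
    pd.pivot_table(df, values="metric_value",
                   index=["timestamp", "device_id"],
                   columns="metric_name", aggfunc="first")
    .reset_index()
    .rename_axis(None, axis=1)
)

# Variant of the set_index/unstack suggestion: no aggregation at all, so it
# raises an error instead of losing data if the index has duplicate entries.
alt = tidy(
    df.set_index(["timestamp", "device_id", "metric_name"])["metric_value"]
    .unstack("metric_name")
    .reset_index()
    .rename_axis(None, axis=1)
)

print(wide)
```

Missing combinations show up as `NaN` rather than `NULL`. If duplicate (timestamp, device_id, metric_name) rows are possible in the real data, the `unstack` variant is the safer check, since it fails loudly where `aggfunc="first"` would quietly keep one value.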
