I am quite new to building my own code from scratch. I'm able to read code to some level and appropriate what I find online, but I've not been able to find something that fits with my case and I would very much like to better my Python skills.

I have a dataframe that takes the shape of:

INDEX  PDATETIME            LOT      PROD  MEAS_TYPE  MEAS_AVG
0      2021-07-19 12:02:45  Sample1  EX    Meas1      30.7124
1      2021-07-19 12:02:45  Sample1  EX    Meas2      0.9992
2      2021-07-19 12:02:45  Sample1  EX    Meas3      0.3948
3      2021-07-19 12:03:45  Sample2  EX    Meas1      0.5185
4      2021-07-19 12:03:45  Sample2  EX    Meas2      0.2171
5      2021-07-19 12:03:45  Sample2  EX    Meas3      0.9885

I am looking to convert it to the following:

INDEX  PDATETIME            LOT      PROD  MEAS1    MEAS2   MEAS3
0      2021-07-19 12:02:45  Sample1  EX    30.7124  0.9992  0.3948
1      2021-07-19 12:03:45  Sample2  EX    0.5185   0.2171  0.9885

I've tried using `pivot`, and it would work if I didn't have the additional columns (e.g. LOT, PROD) that need to be kept alongside the measurements.

Some additional things to consider: the number of samples present in LOT can be anywhere from 1 to over a million. The PROD isn't limited to a unique value (e.g. EX). MEAS_TYPE can contain anywhere from 1 to dozens of parameters.

Your help is much appreciated!

Ryan
  • `df.pivot(index=['PDATETIME', 'LOT', 'PROD'], columns='MEAS_TYPE', values='MEAS_AVG').reset_index().rename_axis(None, axis=1)` – Henry Ecker Jul 23 '21 at 15:25
  • Yes it does! Thank you very much. I'll have to spend some time unpacking what you've done, but that will be good for me. :) – Ryan Jul 23 '21 at 15:31
  • Columns you want to "save" go into the index. The column that has the labels for the new columns goes in "columns", and the values that should go under those new columns go in "values". – Henry Ecker Jul 23 '21 at 15:36
  • More manually, you could do something like this `df.set_index(['PDATETIME', 'LOT', 'PROD', 'MEAS_TYPE']).drop(columns=['INDEX']).unstack()` – ifly6 Jul 23 '21 at 16:15
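
To unpack the two suggestions in the comments, here is a minimal sketch that rebuilds the sample data from the question and reshapes it both ways. It assumes that INDEX is just the default row index rather than a real column, and that pandas is at least version 1.1, which is when `pivot` began accepting a list for `index`. The names `keys`, `wide` and `wide_unstack` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
# Assumption: INDEX is the default RangeIndex, not a data column.
df = pd.DataFrame({
    "PDATETIME": pd.to_datetime(
        ["2021-07-19 12:02:45"] * 3 + ["2021-07-19 12:03:45"] * 3
    ),
    "LOT": ["Sample1"] * 3 + ["Sample2"] * 3,
    "PROD": ["EX"] * 6,
    "MEAS_TYPE": ["Meas1", "Meas2", "Meas3"] * 2,
    "MEAS_AVG": [30.7124, 0.9992, 0.3948, 0.5185, 0.2171, 0.9885],
})

# Columns to keep as row identifiers (hypothetical helper name).
keys = ["PDATETIME", "LOT", "PROD"]

# Approach 1: pivot with several index columns.
wide = (
    df.pivot(index=keys, columns="MEAS_TYPE", values="MEAS_AVG")
      .reset_index()               # turn the key levels back into columns
      .rename_axis(None, axis=1)   # drop the leftover "MEAS_TYPE" columns label
)

# Approach 2: the same reshape done by hand with set_index + unstack.
# Selecting the MEAS_AVG Series first keeps the resulting columns flat.
wide_unstack = (
    df.set_index(keys + ["MEAS_TYPE"])["MEAS_AVG"]
      .unstack()                   # move the last index level (MEAS_TYPE) to columns
      .reset_index()
      .rename_axis(None, axis=1)
)

print(wide)
```

Both approaches expect each (PDATETIME, LOT, PROD, MEAS_TYPE) combination to appear only once and raise a `ValueError` otherwise. If repeated measurements are possible in the real data, `pivot_table` with an explicit `aggfunc` may be the better fit.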
