I am quite new to writing my own code from scratch. I can read code to some level and adapt what I find online, but I haven't found anything that fits my case, and I would very much like to improve my Python skills.
I have a dataframe that takes the shape of:
INDEX | PDATETIME | LOT | PROD | MEAS_TYPE | MEAS_AVG |
---|---|---|---|---|---|
0 | 2021-07-19 12:02:45 | Sample1 | EX | Meas1 | 30.7124 |
1 | 2021-07-19 12:02:45 | Sample1 | EX | Meas2 | 0.9992 |
2 | 2021-07-19 12:02:45 | Sample1 | EX | Meas3 | 0.3948 |
3 | 2021-07-19 12:03:45 | Sample2 | EX | Meas1 | 0.5185 |
4 | 2021-07-19 12:03:45 | Sample2 | EX | Meas2 | 0.2171 |
5 | 2021-07-19 12:03:45 | Sample2 | EX | Meas3 | 0.9885 |
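
To make this reproducible, here is a minimal sketch that builds the sample frame above. The variable name `df` and the conversion of PDATETIME to a datetime dtype are assumptions for illustration; my real data may store it differently.

```python
import pandas as pd

# Sample data copied from the table above (long format: one row per measurement).
df = pd.DataFrame(
    {
        # Assumption: PDATETIME is parsed as datetime; it could also be plain strings.
        "PDATETIME": pd.to_datetime(
            ["2021-07-19 12:02:45"] * 3 + ["2021-07-19 12:03:45"] * 3
        ),
        "LOT": ["Sample1"] * 3 + ["Sample2"] * 3,
        "PROD": ["EX"] * 6,
        "MEAS_TYPE": ["Meas1", "Meas2", "Meas3"] * 2,
        "MEAS_AVG": [30.7124, 0.9992, 0.3948, 0.5185, 0.2171, 0.9885],
    }
)

print(df)
```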
I am looking to convert it to the following:
INDEX | PDATETIME | LOT | PROD | MEAS1 | MEAS2 | MEAS3 |
---|---|---|---|---|---|---|
0 | 2021-07-19 12:02:45 | Sample1 | EX | 30.7124 | 0.9992 | 0.3948 |
1 | 2021-07-19 12:03:45 | Sample2 | EX | 0.5185 | 0.2171 | 0.9885 |
I've tried using `pivot`, and it would work if I didn't have the additional columns (e.g. LOT, PROD) that I need to keep in the output.
Some additional things to consider: the number of samples present in LOT can be anywhere from 1 to over a million. PROD isn't limited to a single value (e.g. EX). MEAS_TYPE can contain anywhere from 1 to dozens of distinct parameters.
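
For reference, the reshape described above could possibly be sketched as follows. This is only a sketch under assumptions, not a verified solution: it assumes each (PDATETIME, LOT, PROD, MEAS_TYPE) combination occurs at most once, uses `pivot_table` with a list of index columns and `aggfunc="first"`, and upper-cases the new column names to match the desired output. The names `df`, `keys`, and `wide` are placeholders, and I have not checked how this behaves at a million rows.

```python
import pandas as pd

# Small stand-in for the real data (same values as the first table).
df = pd.DataFrame(
    {
        "PDATETIME": pd.to_datetime(
            ["2021-07-19 12:02:45"] * 3 + ["2021-07-19 12:03:45"] * 3
        ),
        "LOT": ["Sample1"] * 3 + ["Sample2"] * 3,
        "PROD": ["EX"] * 6,
        "MEAS_TYPE": ["Meas1", "Meas2", "Meas3"] * 2,
        "MEAS_AVG": [30.7124, 0.9992, 0.3948, 0.5185, 0.2171, 0.9885],
    }
)

# Columns that identify one output row and should be carried along.
keys = ["PDATETIME", "LOT", "PROD"]

wide = (
    df.pivot_table(
        index=keys,           # several identifying columns at once
        columns="MEAS_TYPE",  # each distinct value becomes its own column
        values="MEAS_AVG",
        aggfunc="first",      # assumption: no duplicate measurements per key
    )
    .reset_index()              # turn the key columns back into ordinary columns
    .rename_axis(columns=None)  # drop the leftover "MEAS_TYPE" axis label
)

# Upper-case only the measurement columns (Meas1 -> MEAS1, ...).
wide.columns = [c if c in keys else c.upper() for c in wide.columns]

print(wide)
```

Measurements missing for a given sample would show up as NaN in this layout.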
Your help is much appreciated!