I have a dataframe that contains multiple entries for single conditions. I want to combine the multiple entries into one row so that each condition is described in one row.
The input data looks like this:
df = pd.DataFrame({"well" : ['E05', 'E05', 'E06', 'E06', 'F05', 'F05', 'F06', 'F06'],
"compound" : ['ABC', 'DMSO', 'DEF', 'XYZ', 'GHI', 'DMSO', 'JKL', 'XYZ'],
"number" : [34, 23, 16, 92, 22, 56, 44, 23]})
Which when printed is this:
well condition number
0 E05 ABC 34
1 E05 UVW 23
2 E06 DEF 16
3 E06 XYZ 92
4 F05 GHI 22
5 F05 UVW 56
6 F06 JKL 44
7 F06 XYZ 23
I would like to generate a dataframe as below:
well condition_1 number_1 condition_2 number_2
0 E05 ABC 34 UVW 23
1 E06 DEF 16 XYZ 92
2 F05 GHI 22 UVW 56
3 F06 JKL 44 XYZ 23
I've tried using df.groupby(by="well"), but have not figured out a way to generate a dataframe from the resulting groupby object. Functions such as count() or describe() don't work because then I get summary statistics for number (and the compound gets ignored entirely), when I want each reported separately.
My next guess is to create a for loop to cycle through every unique value in the well series and pull the corresponding entries, but I'm wondering if there's a more efficient/pythonic way to do this.
Thanks for any help.