There is a similar question, but its solution does not fully fit my needs. I also do not understand all the details of the solution there, so I am not able to adapt it to my situation.
This is my initial dataframe, where each unique value in the Y column should become a column of its own.
   Y  P  v
0  A  X  0
1  A  Y  1
2  B  X  2
3  B  Y  3
4  C  X  4
5  C  Y  5
The result should look like this, where P is the first column (it could also be the index). So P can be understood as the row heading, the values from Y are the column headings, and the values from v fill the cells.
   P  A  B  C
0  X  0  2  4
1  Y  1  3  5
Not working approach
This is based on https://stackoverflow.com/a/52082963/4865723
new_index = ['Y', df.groupby('Y').cumcount()]
final = df.set_index(new_index)
final = final['P'].unstack('Y')
print(final)
The problem here is that the index (or first column) does not contain the values from P; those end up in the cells instead, and the v column is gone entirely.
Y  A  B  C
0  X  X  X
1  Y  Y  Y
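A minimal sketch of how this approach might be adjusted, assuming every (P, Y) pair occurs at most once in the data (otherwise `unstack` raises an error): put P and Y into the index and unstack the v column rather than P. The name `wide` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Y': ['A', 'A', 'B', 'B', 'C', 'C'],
    'P': list('XYXYXY'),
    'v': range(6),
})

# Use P and Y as a two-level index, keep only the v values,
# then move the Y level from the rows into the columns.
wide = df.set_index(['P', 'Y'])['v'].unstack('Y')
print(wide)
```

Here P becomes the index, and the cumcount helper from the linked answer is not needed because P already identifies the rows.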
My own unfinished idea
>>> df.groupby('Y').agg(list)
        P       v
Y
A  [X, Y]  [0, 1]
B  [X, Y]  [2, 3]
C  [X, Y]  [4, 5]
I do not know whether this helps or how to continue from here.
The full MWE
#!/usr/bin/env python3
import pandas as pd
# initial data
df = pd.DataFrame({
    'Y': ['A', 'A', 'B', 'B', 'C', 'C'],
    'P': list('XYXYXY'),
    'v': range(6)
})
print(df)
# final result I want
final = pd.DataFrame({
    'P': list('XY'),
    'A': [0, 1],
    'B': [2, 3],
    'C': [4, 5]
})
print(final)
# approach based on:
# https://stackoverflow.com/a/52082963/4865723
new_index = ['Y', df.groupby('Y').cumcount()]
final = df.set_index(new_index)
final = final['P'].unstack('Y')
print(final)
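For the desired layout (P as rows, Y as columns, v in the cells), `DataFrame.pivot` may be a possible direction. This is only a sketch under the assumption that each (P, Y) combination appears exactly once; with duplicates, `pivot` raises an error and an aggregating alternative such as `pivot_table` would be needed. The names `wide` and `result` are illustrative and not part of the MWE above.

```python
import pandas as pd

df = pd.DataFrame({
    'Y': ['A', 'A', 'B', 'B', 'C', 'C'],
    'P': list('XYXYXY'),
    'v': range(6),
})

# Reshape from long to wide: one row per P, one column per Y value.
wide = df.pivot(index='P', columns='Y', values='v')

# Optional: turn P back into a regular first column and drop the
# leftover 'Y' label from the column axis.
result = wide.reset_index().rename_axis(columns=None)
print(result)
```

Whether to keep P as the index (`wide`) or as the first column (`result`) depends on how the table is used afterwards; both hold the same values.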