Given the following data:
data = pd.DataFrame(
{
"A": ["a", "a", "b", "b"],
"B": ["x", "y", "p", "q"],
"C": ["one", "two", "one", "two"],
}
)
which looks as:
A B C
0 a x one
1 a y two
2 b p one
3 b q two
I would like to create the following:
data_out = pd.DataFrame(
{
"A": ["a", "b"],
"one": ["x", "p"],
"two": ["y", "q"],
}
)
which looks as:
A one two
0 a x y
1 b p q
I'm aware that I could do something along the lines of:
d_piv = pd.pivot_table(
data,
index=["A"],
columns=["C"],
values=["B"],
aggfunc=lambda x: x,
).reset_index()
which gives:
A B
C one two
0 a x y
1 b p q
from which the columns could be cleaned up, but I'm wondering how I'd go about solving this using melt and unstack?
I have tried:
print(data.set_index("C", append=True).unstack())
which gives:
A B
C one two one two
0 a NaN x NaN
1 NaN a NaN y
2 b NaN p NaN
3 NaN b NaN q
The NaN values aren't wanted here, so I could instead try:
data.index = [0, 0, 1, 1]
data.set_index(["A", "C"], append=True).unstack(-1).reset_index(level=-1)
which gives:
A B
C one two
0 a x y
1 b p q
So that's closer - but it still feels as though there's still some unnecessary bits there.
Particularly coding the index like that.
Edit
Solution of :
df.set_index('A').pivot(columns='C', values='B').reset_index().rename_axis(None, axis=1)
is good, but I am wondering whether unstack
can be used here instead of pivot?