You need use pivot
or set_index
+ unstack
for reshape:
df1 = df.pivot(index='Name', columns='Date', values='Height')
print (df1)
Date 1 2 3
Name
a 12 14 15
b 10 11 12
c 11 13 14
df1 = df.set_index(['Name', 'Date'])['Height'].unstack(fill_value=0)
print (df1)
Date 1 2 3
Name
a 12 14 15
b 10 11 12
c 11 13 14
But if get:
ValueError: Index contains duplicate entries, cannot reshape
there are duplicates in columns created new index and columns:
df = pd.DataFrame({
"Name": ["b", "b", "a", "b", "a", "c","a", "b", "c"],
"Date": ["1", "1", "1", "2", "2", "2","3", "3", "3"],
"Height": ["10", "11", "12", "11", "14", "13","15", "12", "14"]
})
print (df)
Date Height Name
0 1 10 b <-duplicate Date=1, Name=b
1 1 11 b <-duplicate Date=1, Name=b
2 1 12 a
3 2 11 b
4 2 14 a
5 2 13 c
6 3 15 a
7 3 12 b
8 3 14 c
Then need pivot_table
with some aggregate function like mean
, sum
or use groupby
+ aggregate function
+ unstack
:
But if get:
DataError: No numeric types to aggregate
it means column Height
for aggregate is not numeric, so use astype
for cast to int
or float
:
df['Height'] = df['Height'].astype(int)
df1 = df.pivot_table(index='Name',
columns='Date',
values='Height',
aggfunc='mean', fill_value=0)
print (df1)
Date 1 2 3
Name
a 12.0 14 15
b 10.5 11 12
c 0.0 13 14
df1 = df.groupby(['Name', 'Date'])['Height'].mean().unstack(fill_value=0)
print (df1)
Date 1 2 3
Name
a 12.0 14.0 15.0
b 10.5 11.0 12.0
c 0.0 13.0 14.0
For columns add reset_index
+ rename_axis
:
df1 = df1.reset_index().rename_axis(None, axis=1)
print (df1)
Name 1 2 3
0 a 12.0 14.0 15.0
1 b 10.5 11.0 12.0
2 c 0.0 13.0 14.0