I have a dataframe "iris" which have three different species, how to create a dataframe with all different specieas as column names and sepal length as the values?
Setup:
import numpy as np
import pandas as pd
df = sns.load_dataset('iris')
print(df.head())
sepal_length sepal_width petal_length petal_width species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa
I can do manually
This is a long and difficult way, I am looking for some groupby method that does this in a single operation possibly using chain.
seto = df.loc[df['species']=='setosa','sepal_length']
ver = df.loc[df['species']=='versicolor','sepal_length']
vir = df.loc[df['species']=='virginica','sepal_length']
ans = pd.DataFrame({'setosa': seto.values,'versicolor':ver.values,'virginica':vir.values})
print(ans.head())
Required output
df.groupby('species')['spepal_length'].SOMETHING.SOMETHING
setosa versicolor virginica
0 5.1 7.0 6.3
1 4.9 6.4 5.8
2 4.7 6.9 7.1
3 4.6 5.5 6.3
4 5.0 6.5 6.5