I'm trying to convert pandas DataFrames from wide to long format. I've tried melt() and wide_to_long() (the easy melt()), yet I kept getting confused by the syntax and the output I received.
I've also read through many posts on SO and around the web about this topic and tried quite a few of the proposed approaches, yet the results were never what I was looking for.
This post helped me discover unstack(), and I finally managed to get the result I wanted by using it twice in a row: df.unstack().unstack().
I'm sure that this is not the best way to do this and was hoping for a tip! Here's my example:
import pandas as pd

# an example df (the real data makes more sense):
series_list = [
    pd.Series(list("hello! hello!"), name='greeting'),
    pd.Series(list("stackoverflow"), name='name'),
    pd.Series(list("howsit going?"), name='question')
]
wide_df = pd.DataFrame(series_list)
Creating a df like that always gives me one in wide format:
          0  1  2  3  4  5  6  7  8  9  10  11  12
greeting  h  e  l  l  o  !     h  e  l   l   o   !
name      s  t  a  c  k  o  v  e  r  f   l   o   w
question  h  o  w  s  i  t     g  o  i   n   g   ?
However, I want each pd.Series's name= attribute to become a column name.
What worked for me is the df.unstack().unstack() mentioned above:
    greeting  name  question
0          h     s         h
1          e     t         o
2          l     a         w
3          l     c         s
4          o     k         i
5          !     o         t
6                v
7          h     e         g
8          e     r         o
9          l     f         i
10         l     l         n
11         o     o         g
12         !     w         ?
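To make the double unstack() easier to follow, here is a minimal sketch of what I believe each step does on this example. The intermediate names stacked and long_df are mine, purely for illustration:

```python
import pandas as pd

# Same example frame as above: each Series becomes one row, labelled by its name.
series_list = [
    pd.Series(list("hello! hello!"), name='greeting'),
    pd.Series(list("stackoverflow"), name='name'),
    pd.Series(list("howsit going?"), name='question'),
]
wide_df = pd.DataFrame(series_list)

# First unstack(): on a frame with a plain (single-level) index this returns
# a Series whose MultiIndex is (column label, row label).
stacked = wide_df.unstack()
print(type(stacked.index).__name__, len(stacked))

# Second unstack(): pivots the innermost index level (the row labels
# 'greeting', 'name', 'question') into columns.
long_df = stacked.unstack()
print(long_df.shape)
print(list(long_df.columns))
```

So the first call flattens everything into one long Series and the second spreads the Series names back out as columns, which is why it takes two calls to get there.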
But this sure is clunky and there must be a better way!
Thanks and have a good day : )