Convert Pandas DataFrame from Wide to Long Format

Question

Given this DataFrame:

data = [['Tom', 'run', '2022-01-26', 'run', '2027-01-26'], ['Max', 'stop', '2020-11-16', 'run', '2022-04-26'], ['Bob', 'run', '2021-10-03', 'stop', '2022-01-26'], ['Ben', 'run', '2020-03-11', 'stop', '2013-01-26'], ['Eva', 'stop', '2017-11-16', 'run', '2015-01-26']]
df = pd.DataFrame(data, columns=['person', 'action_1', 'time_1', 'action_2', 'time_2'])


  person action_1      time_1 action_2      time_2
0    Tom      run  2022-01-26      run  2027-01-26
1    Max     stop  2020-11-16      run  2022-04-26
2    Bob      run  2021-10-03     stop  2022-01-26
3    Ben      run  2020-03-11     stop  2013-01-26
4    Eva     stop  2017-11-16      run  2015-01-26

I want it to look like:

  person action        time
0    Tom    run  2022-01-26
1    Max   stop  2020-11-16
2    Bob    run  2021-10-03
3    Ben    run  2020-03-11
4    Eva   stop  2017-11-16
5    Tom    run  2027-01-26
6    Max    run  2022-04-26
7    Bob   stop  2022-01-26
8    Ben   stop  2013-01-26
9    Eva    run  2015-01-26

https://stackoverflow.com/questions/11346283/renaming-column-names-in-pandas — gontrollez, Oct 28 '22 at 18:36
you're saying "reorder" but it looks like you're actually reshaping - check out [pd.melt](https://pandas.pydata.org/docs/reference/api/pandas.melt.html), e.g. [Reshape wide to long in pandas](https://stackoverflow.com/questions/36537945/reshape-wide-to-long-in-pandas) — Michael Delgado, Oct 28 '22 at 18:38

BeRT2me · Accepted Answer · 2022-10-28T18:55:20.437

3

This can be done using pd.wide_to_long:

df = pd.wide_to_long(df, 
                     stubnames=['action', 'time'],
                     i='person',
                     j='num',
                     sep='_').reset_index()

Output:

  person  num action        time
0    Tom    1    run  2022-01-26
1    Max    1   stop  2020-11-16
2    Bob    1    run  2021-10-03
3    Ben    1    run  2020-03-11
4    Eva    1   stop  2017-11-16
5    Tom    2    run  2027-01-26
6    Max    2    run  2022-04-26
7    Bob    2   stop  2022-01-26
8    Ben    2   stop  2013-01-26
9    Eva    2    run  2015-01-26

edited Oct 28 '22 at 18:55

answered Oct 28 '22 at 18:43

BeRT2me

12,699
2
13
31

whoa - I've never seen this usage before. hat's off :) – Michael Delgado Oct 28 '22 at 18:44
1

@MichaelDelgado It appears that what many people call "wide to long" is actually just looking for "melt", but when you *actually* want "wide to long", this function is quite useful. – BeRT2me Oct 28 '22 at 18:51
that's 100% true for me. thanks for the tip! – Michael Delgado Oct 28 '22 at 19:34
It is giving me an value Error: the id variables need to uniquely identify each row. Any tips? The exact error traces here: if df[i].duplicated().any(): raise ValueError("the id variables need to uniquely identify each row") – juststarted Oct 31 '22 at 15:30
And another value error: ValueError: stubname can't be identical to a column name – juststarted Oct 31 '22 at 16:22
Sounds like your actual data looks significantly different than the example data you have provided. – BeRT2me Oct 31 '22 at 18:18

score 0 · Answer 2 · answered Feb 08 '23 at 04:18

One option with pivot_longer from pyjanitor - it can handle non-unique indices:

# pip install pyjanitor
import janitor

(df
.pivot_longer(
    index='person', 
    names_to = ['action', 'time'], 
    names_pattern = ['action', 'time'])
)
  person action        time
0    Tom    run  2022-01-26
1    Max   stop  2020-11-16
2    Bob    run  2021-10-03
3    Ben    run  2020-03-11
4    Eva   stop  2017-11-16
5    Tom    run  2027-01-26
6    Max    run  2022-04-26
7    Bob   stop  2022-01-26
8    Ben   stop  2013-01-26
9    Eva    run  2015-01-26

Another option, using .value:

(df
.pivot_longer(
    index='person', 
    names_to = '.value', 
    names_pattern = r"(.+)_.+")
) 
  person action        time
0    Tom    run  2022-01-26
1    Max   stop  2020-11-16
2    Bob    run  2021-10-03
3    Ben    run  2020-03-11
4    Eva   stop  2017-11-16
5    Tom    run  2027-01-26
6    Max    run  2022-04-26
7    Bob   stop  2022-01-26
8    Ben   stop  2013-01-26
9    Eva    run  2015-01-26

Convert Pandas DataFrame from Wide to Long Format

2 Answers2

Linked

Related