2

Given this DataFrame:

data = [['Tom', 'run', '2022-01-26', 'run', '2027-01-26'], ['Max', 'stop', '2020-11-16', 'run', '2022-04-26'], ['Bob', 'run', '2021-10-03', 'stop', '2022-01-26'], ['Ben', 'run', '2020-03-11', 'stop', '2013-01-26'], ['Eva', 'stop', '2017-11-16', 'run', '2015-01-26']]
df = pd.DataFrame(data, columns=['person', 'action_1', 'time_1', 'action_2', 'time_2'])


  person action_1      time_1 action_2      time_2
0    Tom      run  2022-01-26      run  2027-01-26
1    Max     stop  2020-11-16      run  2022-04-26
2    Bob      run  2021-10-03     stop  2022-01-26
3    Ben      run  2020-03-11     stop  2013-01-26
4    Eva     stop  2017-11-16      run  2015-01-26

I want it to look like:

  person action        time
0    Tom    run  2022-01-26
1    Max   stop  2020-11-16
2    Bob    run  2021-10-03
3    Ben    run  2020-03-11
4    Eva   stop  2017-11-16
5    Tom    run  2027-01-26
6    Max    run  2022-04-26
7    Bob   stop  2022-01-26
8    Ben   stop  2013-01-26
9    Eva    run  2015-01-26
BeRT2me
  • 12,699
  • 2
  • 13
  • 31
  • https://stackoverflow.com/questions/11346283/renaming-column-names-in-pandas – gontrollez Oct 28 '22 at 18:36
  • you're saying "reorder" but it looks like you're actually reshaping - check out [pd.melt](https://pandas.pydata.org/docs/reference/api/pandas.melt.html), e.g. [Reshape wide to long in pandas](https://stackoverflow.com/questions/36537945/reshape-wide-to-long-in-pandas) – Michael Delgado Oct 28 '22 at 18:38

2 Answers2

3

This can be done using pd.wide_to_long:

df = pd.wide_to_long(df, 
                     stubnames=['action', 'time'],
                     i='person',
                     j='num',
                     sep='_').reset_index()

Output:

  person  num action        time
0    Tom    1    run  2022-01-26
1    Max    1   stop  2020-11-16
2    Bob    1    run  2021-10-03
3    Ben    1    run  2020-03-11
4    Eva    1   stop  2017-11-16
5    Tom    2    run  2027-01-26
6    Max    2    run  2022-04-26
7    Bob    2   stop  2022-01-26
8    Ben    2   stop  2013-01-26
9    Eva    2    run  2015-01-26
BeRT2me
  • 12,699
  • 2
  • 13
  • 31
  • whoa - I've never seen this usage before. hat's off :) – Michael Delgado Oct 28 '22 at 18:44
  • 1
    @MichaelDelgado It appears that what many people call "wide to long" is actually just looking for "melt", but when you *actually* want "wide to long", this function is quite useful. – BeRT2me Oct 28 '22 at 18:51
  • that's 100% true for me. thanks for the tip! – Michael Delgado Oct 28 '22 at 19:34
  • It is giving me an value Error: the id variables need to uniquely identify each row. Any tips? The exact error traces here: if df[i].duplicated().any(): raise ValueError("the id variables need to uniquely identify each row") – juststarted Oct 31 '22 at 15:30
  • And another value error: ValueError: stubname can't be identical to a column name – juststarted Oct 31 '22 at 16:22
  • Sounds like your actual data looks significantly different than the example data you have provided. – BeRT2me Oct 31 '22 at 18:18
0

One option with pivot_longer from pyjanitor - it can handle non-unique indices:

# pip install pyjanitor
import janitor

(df
.pivot_longer(
    index='person', 
    names_to = ['action', 'time'], 
    names_pattern = ['action', 'time'])
)
  person action        time
0    Tom    run  2022-01-26
1    Max   stop  2020-11-16
2    Bob    run  2021-10-03
3    Ben    run  2020-03-11
4    Eva   stop  2017-11-16
5    Tom    run  2027-01-26
6    Max    run  2022-04-26
7    Bob   stop  2022-01-26
8    Ben   stop  2013-01-26
9    Eva    run  2015-01-26

Another option, using .value:

(df
.pivot_longer(
    index='person', 
    names_to = '.value', 
    names_pattern = r"(.+)_.+")
) 
  person action        time
0    Tom    run  2022-01-26
1    Max   stop  2020-11-16
2    Bob    run  2021-10-03
3    Ben    run  2020-03-11
4    Eva   stop  2017-11-16
5    Tom    run  2027-01-26
6    Max    run  2022-04-26
7    Bob   stop  2022-01-26
8    Ben   stop  2013-01-26
9    Eva    run  2015-01-26
sammywemmy
  • 27,093
  • 4
  • 17
  • 31