Pandas DataFrame Wide to Long based on first column values

Question

I would like to transform a Pandas DataFrame of the following wide format

df = pd.DataFrame([['A', '1', '2', '3'], ['B', '4', '5', '6'], ['C', '7', '8', '9']], columns=['ABC', 'def', 'ghi', 'jkl'])

df =
   ABC  def  ghi  jkl
0    A    1    2    3
1    B    4    5    6
2    C    7    8    9

into a long format, where the values from the first column still correspond to the values in the lower-case columns. The column names cannot be used as stub names. The names of the new columns are irrelevant and could be renamed later.

The output should look something like this:

I am not sure how to best and efficiently do this. Can this be done with wide_to_long()? Then I would not know how to deal with stub names. The best would be an efficient one-liner that can be used on a large table.

Many thanks!!

jezrael · Answer 1 · 2021-06-29T09:26:47.843

2

Use DataFrame.melt with DataFrame.sort_index and remove variable column:

df1 = (df.melt("ABC", value_name='new', ignore_index=False)
          .sort_index(ignore_index=True)
          .drop('variable', axis=1)
         )
print (df1)
  ABC new
0   A   1
1   A   2
2   A   3
3   B   4
4   B   5
5   B   6
6   C   7
7   C   8
8   C   9

If need more dynamic solution with generate first value of columns names:

first = df.columns[0]
df1 = (df.melt(first, value_name='new', ignore_index=False)
         .sort_index(ignore_index=True)
         .drop('variable', axis=1))

edited Jun 29 '21 at 09:26

answered Jun 29 '21 at 08:35

jezrael

822,522
95
1,334
1,252

2

do you think this is a [dupe](https://stackoverflow.com/questions/36537945/reshape-wide-to-long-in-pandas) ? – Umar.H Jun 29 '21 at 08:36
1

@Umar.H - Almost, there is not removed column `variable` – jezrael Jun 29 '21 at 08:37
Many thanks! However, df.melt does not necessarily preserve the order if you have duplicate values in the ABC column I believe. – Fhtsm Jun 29 '21 at 09:25
@Fhtsm - Answer was edited by `sort_index` instead `sort_values` – jezrael Jun 29 '21 at 09:26
@Fhtsm - and also `ignore_index=False` added to `melt` – jezrael Jun 29 '21 at 09:28

Corralien · Accepted Answer · 2021-06-29T08:38:30.687

1

You can use df.stack:

>>> df.set_index('ABC') \
      .stack() \
      .reset_index(level='ABC') \
      .reset_index(drop=True)

  ABC  0
0   A  1
1   A  2
2   A  3
3   B  4
4   B  5
5   B  6
6   C  7
7   C  8
8   C  9

or use df.melt as suggested by @MustafaAydın:

>>> df.melt('ABC') \
      .sort_values('ABC') \
      .drop(columns='variable') \
      .reset_index(drop=True)

  ABC value
0   A     1
1   A     2
2   A     3
3   B     4
4   B     5
5   B     6
6   C     7
7   C     8
8   C     9

edited Jun 29 '21 at 08:38

answered Jun 29 '21 at 08:32

Corralien

109,409
8
28
52

Many thanks! df.stack worked great (keeping the order of the columns). df.melt does not necessarily preserve the order if you have duplicate values in the ABC column I believe. – Fhtsm Jun 29 '21 at 09:20

Pandas DataFrame Wide to Long based on first column values

2 Answers2