0

I have a pandas dataframe with unique values in ID column.

df = pd.DataFrame({'ID': ['A', 'B', 'C'],
                   'STAT': ['X', 'X', 'X'],
                   'IN1': [1, 3, 7],
                   'IN2': [2, 5, 8],
                   'IN3': [3, 6, 9]})

I need to create a new dataframe where I have a row for each value in IN1, IN2 and IN3 with corresponding ID and STAT:

df_new = pd.DataFrame({'IN': [1, 2, 3, 3, 5, 6, 7, 8, 9],
                       'ID': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                       'STAT': ['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X']})
Martin
  • 41
  • 5

2 Answers2

1

You can use pandas.wide_to_long:

(pd.wide_to_long(df, ['IN'], j='to_drop', i='ID')
   .droplevel('to_drop')
   .sort_index()
   .reset_index()
)

output:

  ID STAT  IN
0  A    X   1
1  A    X   2
2  A    X   3
3  B    X   3
4  B    X   5
5  B    X   6
6  C    X   7
7  C    X   8
8  C    X   9
mozway
  • 194,879
  • 13
  • 39
  • 75
0

You can use melt

df.melt(id_vars=['ID','STAT'], value_name='IN')

Gives:

ID  STAT    variable    IN
0   A   X   IN1 1
1   B   X   IN1 3
2   C   X   IN1 7
3   A   X   IN2 2
4   B   X   IN2 5
5   C   X   IN2 8
6   A   X   IN3 3
7   B   X   IN3 6
8   C   X   IN3 9

To make the df into a row:

(df.melt(id_vars=['ID','STAT'], value_name='IN')
   .sort_values(by='ID')
   .drop('variable', axis=1)
)

Gives the exact same results.

mozway
  • 194,879
  • 13
  • 39
  • 75
Mohammad
  • 3,276
  • 2
  • 19
  • 35