1

I need to transform my table from a wide format to a long table. The table has measurements over time, let's say mass over time: m0, m1, m2 etc. so it looks like this:

ID | Age | m0 | m1 | m2 | m3
1    67    72   69   66   67
2    70    80   81   79   77
3    72    69   69   70   70

How I want it is:

ID | Age | time | m
1    67     0     72
1    67     1     69
1    67     2     66
1    67     3     67
2    70     0     80
2    70     1     81
2    70     2     79
2    70     3     77
...

I appreciate any help! Thank you in advance.

Cheers.

mzjn
  • 48,958
  • 13
  • 128
  • 248
  • 1
    What do you need help with specifically? Have you tried anything? – mzjn May 04 '21 at 08:13
  • Good question, I have tried the melt function, but end up with NaN values or distorted indexes. So I figured that I might have to fine tune the melt function better and that is what I need help with. – Ward Nieboer May 04 '21 at 09:08

1 Answers1

0

You can make use of pandas melt method in this case

result = df.melt(id_vars=['ID', 'Age'], value_vars=['m0', 'm1', 'm2', 'm3'])
result.columns = ['ID', 'Age', 'time', 'm']
result['time'] = result['time'].str.replace('m', '')
result = result.sort_values('Age').reset_index(drop=True)

print(result)
    ID  Age time   m
0    1   67    0  72
1    1   67    1  69
2    1   67    2  66
3    1   67    3  67
4    2   70    0  80
5    2   70    1  81
6    2   70    2  79
7    2   70    3  77
8    3   72    0  69
9    3   72    1  69
10   3   72    2  70
11   3   72    3  70

Alternative method using pd.wide_to_long

result = pd.wide_to_long(df, stubnames=["m"], i=["ID", "Age"], j="").reset_index()
result.columns = ['ID', 'Age', 'time', 'm']
result = result.sort_values('Age').reset_index(drop=True)
print(result)
    ID  Age  time   m
0    1   67     0  72
1    1   67     1  69
2    1   67     2  66
3    1   67     3  67
4    2   70     0  80
5    2   70     1  81
6    2   70     2  79
7    2   70     3  77
8    3   72     0  69
9    3   72     1  69
10   3   72     2  70
11   3   72     3  70

If there are more variables like m, one can mention it inside stubnames
pd.wide_to_long documentation : https://pandas.pydata.org/docs/reference/api/pandas.wide_to_long.html

imdevskp
  • 2,103
  • 2
  • 9
  • 23
  • Thank you for the solution! I have one more question, what if I have 8 variables like m, let's say one of them is height (h), and I want it as a column next to m, do I have to do this step 8 times? – Ward Nieboer May 04 '21 at 09:13
  • @WardNieboer you can either do `df.melt` for all 8 columns or you can make use of `pd.wide_to_long` as shown in the updated answer. here is the documentation as well https://pandas.pydata.org/docs/reference/api/pandas.wide_to_long.html – imdevskp May 04 '21 at 09:47