3

My data set is in form of:

enter image description here

I want to convert it into :

enter image description here

How can I do it in Python using pandas?

It solved thanks and appreciate your time to help!!! +1 for all

Tell Me How
  • 672
  • 12
  • 16

4 Answers4

3

You can use pandas.melt without specifying value_vars

If not specified, uses all columns that are not set as id_vars.

df.melt(id_vars='name', var_name='year').sort_values('name')

  name  year  value
0  abc  2016      1
2  abc  2017      2
4  abc  2018      5
6  abc  2019      9
1  def  2016      5
3  def  2017      8
5  def  2018      8
7  def  2019      4
henrywongkk
  • 1,840
  • 3
  • 17
  • 26
2

try this:

pd.melt(df, id_vars=['name'], value_vars=['2016', '2017', '2018',"2019"],var_name='year', value_name='value').sort_values('name')

Output:


+----+-------+-------+-------+
|    | name  | year  | value |
+----+-------+-------+-------+
| 0  | abc   | 2016  |     1 |
| 2  | abc   | 2017  |     2 |
| 4  | abc   | 2018  |     5 |
| 6  | abc   | 2019  |     9 |
| 1  | def   | 2016  |     5 |
| 3  | def   | 2017  |     8 |
| 5  | def   | 2018  |     8 |
| 7  | def   | 2019  |     4 |
+----+-------+-------+-------+

M_S_N
  • 2,764
  • 1
  • 17
  • 38
1

You can use

a = df.columns[1:]
df.melt(id_vars='name',value_vars = a,var_name='year').sort_values('name')
moys
  • 7,747
  • 2
  • 11
  • 42
1
  • pandas.melt is the easiest approach to convert your dataframe to a tidy format.
    • Unpivot a DataFrame from wide format to long format, optionally leaving identifier variables set.
  • An alternate approach is pandas.wide_to_long
    • Wide panel to long format. Less flexible but more user-friendly than melt.
import pandas as pd

# create dataframe
df = pd.DataFrame({'name': ['abc', 'def'],
                   '2016': [1, 5],
                   '2017': [2, 8],
                   '2018': [5, 8],
                   '2019': [9, 4]})

name  2016  2017  2018  2019
 abc     1     2     5     9
 def     5     8     8     4

# melt df
df_melt = df.melt(id_vars='name', value_vars=['2016', '2017', '2018', '2019'])

name variable  value
 abc     2016      1
 def     2016      5
 abc     2017      2
 def     2017      8
 abc     2018      5
 def     2018      8
 abc     2019      9
 def     2019      4
Trenton McKinney
  • 56,955
  • 33
  • 144
  • 158