0

I have some data loaded in python with pandas. I also have the following data frame structure

Country Year Subject Value
‘USA’   1991  ‘GDP’  NUM1
‘USA’   1992  ‘GDP’  NUM2
‘USA’   1993  ‘GDP’  NUM3
‘USA’   1991  ‘INV’  NUM4
‘USA’   1992  ‘INV’  NUM5
‘USA’   1993  ‘INV’  NUM6

I want to turn this format into this more standard panel data structure

Country Year   GDP   INV
‘USA’   1991  NUM1  NUM4
‘USA’   1992  NUM2  NUM5
‘USA’   1993  NUM3  NUM6

Can anybody help me?

Thank you in advance

harvpan
  • 8,571
  • 2
  • 18
  • 36
Jorge Alonso
  • 103
  • 11

1 Answers1

0
df:

Country Year    Subject Value
0   USA 1991    GDP     10
1   USA 1992    GDP     20
2   USA 1993    GDP     30
3   USA 1991    INV     40
4   USA 1992    INV     50
5   USA 1993    INV     60

You need pivot_table.

pv = df.pivot_table(index=['Country', 'Year'], columns=['Subject'], values=['Value']).reset_index()
pv.columns = [col[0] if col[1]=='' else col[1] for col in pv.columns.values ]

Output:

    Country Year    GDP INV
0   USA     1991    10  40
1   USA     1992    20  50
2   USA     1993    30  60
harvpan
  • 8,571
  • 2
  • 18
  • 36