0

Given the following tidy pd.DataFrame:

import numpy as np
import pandas as pd

countries = np.tile(["A", "A", "A", "B", "B", "B", "C", "C", "C"], 2)
years = np.tile([2018, 2019, 2020], 6)
indicators = np.repeat(["ind_1", "ind_2"], 9)
vals = list(range(1, 19))

df = pd.DataFrame(data=[countries, years, indicators, vals]).T
df.columns = ["countries", "years", "indicators", "value"]
print(df)

   countries years indicators value
0          A  2018      ind_1     1
1          A  2019      ind_1     2
2          A  2020      ind_1     3
3          B  2018      ind_1     4
4          B  2019      ind_1     5
5          B  2020      ind_1     6
6          C  2018      ind_1     7
7          C  2019      ind_1     8
8          C  2020      ind_1     9
9          A  2018      ind_2    10
10         A  2019      ind_2    11
11         A  2020      ind_2    12
12         B  2018      ind_2    13
13         B  2019      ind_2    14
14         B  2020      ind_2    15
15         C  2018      ind_2    16
16         C  2019      ind_2    17
17         C  2020      ind_2    18

I am looking for a pyhtonic way to pivot df. The column index should have two levels: ["indicators", "years"]. That is, the result should look like this:

indicators ind_1           ind_2          
years       2018 2019 2020  2018 2019 2020
countries                                 
A              1    2    3    10   11   12
B              4    5    6    13   14   15
C              7    8    9    16   17   18

The below code will do it. However, it feels a bit strange. Is there a more sophisticated way to do this?

df = df.set_index(["indicators", "years"]).pivot(columns="countries").T
df = df.droplevel(level=0, axis=0)
Andi
  • 3,196
  • 2
  • 24
  • 44

0 Answers0