Given the following tidy pd.DataFrame
:
import numpy as np
import pandas as pd
countries = np.tile(["A", "A", "A", "B", "B", "B", "C", "C", "C"], 2)
years = np.tile([2018, 2019, 2020], 6)
indicators = np.repeat(["ind_1", "ind_2"], 9)
vals = list(range(1, 19))
df = pd.DataFrame(data=[countries, years, indicators, vals]).T
df.columns = ["countries", "years", "indicators", "value"]
print(df)
countries years indicators value
0 A 2018 ind_1 1
1 A 2019 ind_1 2
2 A 2020 ind_1 3
3 B 2018 ind_1 4
4 B 2019 ind_1 5
5 B 2020 ind_1 6
6 C 2018 ind_1 7
7 C 2019 ind_1 8
8 C 2020 ind_1 9
9 A 2018 ind_2 10
10 A 2019 ind_2 11
11 A 2020 ind_2 12
12 B 2018 ind_2 13
13 B 2019 ind_2 14
14 B 2020 ind_2 15
15 C 2018 ind_2 16
16 C 2019 ind_2 17
17 C 2020 ind_2 18
I am looking for a pyhtonic way to pivot df
. The column index should have two levels: ["indicators", "years"]
. That is, the result should look like this:
indicators ind_1 ind_2
years 2018 2019 2020 2018 2019 2020
countries
A 1 2 3 10 11 12
B 4 5 6 13 14 15
C 7 8 9 16 17 18
The below code will do it. However, it feels a bit strange. Is there a more sophisticated way to do this?
df = df.set_index(["indicators", "years"]).pivot(columns="countries").T
df = df.droplevel(level=0, axis=0)