I have a pandas DataFrame that holds monthly counts at several hierarchical levels. It is in long format, and I want to convert it to wide format, with one column for each entity at each level of aggregation.
It has the following format:
date  | country | state | county | population
01-01 | cc1     | s1    | c1     | 5
01-01 | cc1     | s1    | c2     | 4
01-01 | cc1     | s2    | c1     | 10
01-01 | cc1     | s2    | c2     | 11
02-01 | cc1     | s1    | c1     | 6
02-01 | cc1     | s1    | c2     | 5
02-01 | cc1     | s2    | c1     | 11
02-01 | cc1     | s2    | c2     | 12
.
.
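For reproducibility, the rows shown above can be built as a small DataFrame. This is only a minimal sketch: the variable name `df` is my own choice, and the real data has more dates, states, and counties than these eight rows.

```python
import pandas as pd

# Sample rows copied from the table above (two dates, two states, two counties).
# The name `df` is an arbitrary choice for illustration.
df = pd.DataFrame(
    {
        "date": ["01-01"] * 4 + ["02-01"] * 4,
        "country": ["cc1"] * 8,
        "state": ["s1", "s1", "s2", "s2"] * 2,
        "county": ["c1", "c2", "c1", "c2"] * 2,
        "population": [5, 4, 10, 11, 6, 5, 11, 12],
    }
)
print(df)
```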
Now I want to transform this into the following format:
date  | country_pop | s1_pop | s2_pop | ... | s1_c1_pop | s1_c2_pop | s2_c1_pop | s2_c2_pop | ...
01-01 | 30          | 9      | 21     | ... | 5         | 4         | 10        | 11        | ...
02-01 | 34          | 11     | 23     | ... | 6         | 5         | 11        | 12        | ...
.
.
There are 4 states in total, s1 through s4.
The counties in each state are labelled c1 through c10. Some states may have fewer than 10 counties, and I want the columns for the missing counties to be filled with zeros.
I want to get a time series at each level of aggregation, ordered by date. How can I do this?
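To make the target concrete, here is a rough sketch of one possible approach, not necessarily the best one. It aggregates each level separately with `groupby` / `pivot_table`, reindexes the columns to the full s1..s4 by c1..c10 grid so that missing combinations become zeros, and concatenates the pieces. It assumes a single country (as in the sample), and the names `states`, `counties`, and `wide` are my own.

```python
import pandas as pd

# Same sample rows as in the input table above.
df = pd.DataFrame(
    {
        "date": ["01-01"] * 4 + ["02-01"] * 4,
        "country": ["cc1"] * 8,
        "state": ["s1", "s1", "s2", "s2"] * 2,
        "county": ["c1", "c2", "c1", "c2"] * 2,
        "population": [5, 4, 10, 11, 6, 5, 11, 12],
    }
)

# Full set of labels expected in the output (assumed: s1..s4 and c1..c10).
states = [f"s{i}" for i in range(1, 5)]
counties = [f"c{j}" for j in range(1, 11)]

# Country level: one total per date (assumes a single country).
country = df.groupby("date")["population"].sum().rename("country_pop")

# State level: one column per state, zeros for states absent from the data.
state = df.pivot_table(index="date", columns="state",
                       values="population", aggfunc="sum", fill_value=0)
state = state.reindex(columns=states, fill_value=0)
state.columns = [f"{s}_pop" for s in state.columns]

# County level: one column per (state, county) pair, zeros for missing pairs.
county = df.pivot_table(index="date", columns=["state", "county"],
                        values="population", aggfunc="sum", fill_value=0)
full_grid = pd.MultiIndex.from_product([states, counties],
                                       names=["state", "county"])
county = county.reindex(columns=full_grid, fill_value=0)
county.columns = [f"{s}_{c}_pop" for s, c in county.columns]

# Combine all levels side by side and order the rows by date.
wide = pd.concat([country, state, county], axis=1).sort_index().reset_index()
print(wide[["date", "country_pop", "s1_pop", "s2_pop", "s1_c1_pop", "s2_c2_pop"]])
```

Note that `sort_index()` here sorts the date strings lexicographically; with real dates it would probably be safer to convert the column with `pd.to_datetime` first.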