I have the right multi-index structure for the dataset, however I'm unable to fit this template to the dataset.
Importing the dataset:
data = pd.read_excel('IRCC_M_TRStudy_0001_E.xls')
code for multiindex columns:
years = (2015,2016,2017,2018,2019)
months = [
("Jan", "Feb", "Mar"),
("Apr", "May", "Jun"),
("Jul", "Aug", "Sep"),
("Oct", "Nov", "Dec"),
]
tuples = [(year, f"Q{i + 1}", month) for year in years for i in range(4) for month in months[i]]
multi_index = pd.MultiIndex.from_tuples(tuples)
My attempt to fit this template to the dataset:
df = pd.DataFrame(data, index = data['Country of Citizenship'], columns = multi_index)
the result:
consists of an index of 'countries of citizenship' and multi-index columns consisting of 3 levels - years(2015 - 2019), 4 Quarters for each year and 3 months per quarter (as expected). However, all the data is missing - all columns and rows show 'nan' values.
The expected results should look like this:
2015
Q1 Q2 Q3 Q4
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Country
USA 34 33 23 12 34 23 23 12 34 56 67 57
India 33 12 29 16 35 27 25 15 33 57 63 51
The above table repeats for years 2016,2017,2018,2019 from left to right. The data above is only for the purpose of representation, I want to fit the multi-index template to the dataset that consists of similar data. Also, how can I position the index 'country' a row below the row containing months as shown in the expected results?