I am working on an interactive visualization of the World Happiness Report for the years 2015 through 2020. The data was split across 6 CSV files. Using pandas, I have successfully cleaned the data and concatenated the files into one big JSON file with the following format:
[
{
"Country": "Switzerland",
"Year": 2015,
"Happiness Rank": 1,
"Happiness Score": 7.587000000000001
},
{
"Country": "Iceland",
"Year": 2015,
"Happiness Rank": 2,
"Happiness Score": 7.561
},
{
"Country": "Switzerland",
"Year": 2016,
"Happiness Rank": 2,
"Happiness Score": 7.5089999999999995
},
{
"Country": "Iceland",
"Year": 2016,
"Happiness Rank": 3,
"Happiness Score": 7.501
},
{
"Country": "Switzerland",
"Year": 2017,
"Happiness Rank": 3,
"Happiness Score": 7.49399995803833
},
{
"Country": "Iceland",
"Year": 2017,
"Happiness Rank": 1,
"Happiness Score": 7.801,
}
]
Now, I would like to restructure the JSON file programmatically so that it is nested by year and then by country, like this:
{
"2015": {
"Switzerland": {
"Happiness Rank": 1,
"Happiness Score": 7.587000000000001
},
"Iceland": {
"Happiness Rank": 2,
"Happiness Score": 7.561
}
},
"2016": {
"Switzerland": {
"Happiness Rank": 2,
"Happiness Score": 7.5089999999999995
},
"Iceland": {
"Happiness Rank": 3,
"Happiness Score": 7.501
}
},
"2017": {
"Switzerland": {
"Happiness Rank": 3,
"Happiness Score": 7.49399995803833
},
"Iceland": {
"Happiness Rank": 1,
"Happiness Score": 7.801
}
}
}
It has to be done programmatically, since there are over 900 distinct (country, year) pairs. I want the JSON in this format because it makes the file more readable and makes it easier to select the appropriate data. If I want the rank of Iceland in 2015, I can then do data["2015"]["Iceland"]["Happiness Rank"] (the year is a string, since JSON object keys are always strings).
Does anyone know the easiest / most convenient way to do this in Python?
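For reference, here is a minimal sketch of the regrouping described above, using only the standard library. The function name `nest_by_year` and the inline `records` list are illustrative assumptions; in practice the list would come from `json.load` on the concatenated file. It may well not be the most convenient approach, but it shows the intended result.

```python
import json


def nest_by_year(records):
    """Regroup flat records into {year: {country: {remaining fields}}}."""
    nested = {}
    for record in records:
        # Copy so the input records are left unmodified.
        fields = dict(record)
        # JSON object keys are always strings, so convert the year up front.
        year = str(fields.pop("Year"))
        country = fields.pop("Country")
        # Create the per-year dict on first use, then store the country entry.
        nested.setdefault(year, {})[country] = fields
    return nested


# Hypothetical sample taken from the records shown in the question.
records = [
    {"Country": "Switzerland", "Year": 2015, "Happiness Rank": 1, "Happiness Score": 7.587000000000001},
    {"Country": "Iceland", "Year": 2015, "Happiness Rank": 2, "Happiness Score": 7.561},
    {"Country": "Switzerland", "Year": 2016, "Happiness Rank": 2, "Happiness Score": 7.5089999999999995},
]

data = nest_by_year(records)
print(json.dumps(data, indent=4))
print(data["2015"]["Iceland"]["Happiness Rank"])
```

If a (country, year) pair appeared more than once, the later record would silently overwrite the earlier one, so this assumes the pairs are unique.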