i would like to transform some json file or dict object in a table ready for sql queries.
I have already found several great topics but none of them seem to answer my question:
Python flatten multilevel/nested JSON
Flatten nested dictionaries, compressing keys
In my case i have json files which stack every occurence with the same structure in a list like this:
[{
"_id": "some_id_value",
"time": {
"$date": "2019-12-28"
},
"boolean_key": false,
"data_stacked": [{
"name": "task1",
"time": {
"$date": "2019-12-28"
},
"time_spent": 2.2,
"size": {
"$number": "0"
},
"other": {}
}, {
"name": "task2",
"time": {
"$date": "2019-12-28"
},
"time_spent": 2.7,
"size": {
"$number": "0"
},
"other": {}
}]
},
{
"_id": "some_other_id_value",
"time": {
"$date": "2019-12-30"
},
"boolean_key": true,
"data_stacked": [{
"name": "task3",
"time": {
"$date": "2019-12-30"
},
"time_spent": 3.4,
"size": {
"$number": "0"
},
"other": {}
}, {
"name": "task4",
"time": {
"$date": "2019-12-30"
},
"time_spent": 4.1,
"size": {
"$number": "0"
},
"other": {}
}]
}]
and i would like to iterate on the first keys which represent standard attribute
[
{
"_id": "some_id_value",
"time.$date": "2019-12-28"
"boolean_key": false,
"data_stacked.name": "task1",
"data_stacked.time.$date": "2019-12-28"
"data_stacked.time_spent": 2.2,
"data_stacked.size.$number": "0"
"data_stacked.other": {}
},
{
"_id": "some_id_value",
"time.$date": "2019-12-28"
"boolean_key": false,
"data_stacked.name": "task2",
"data_stacked.time.$date": "2019-12-28"
"data_stacked.time_spent": 2.7,
"data_stacked.size.$number": "0"
"data_stacked.other": {}
},
{
"_id": "some_other_id_value",
"time.$date": "2019-12-30"
"boolean_key": true,
"data_stacked.name": "task3",
"data_stacked.time.$date": "2019-12-30"
"data_stacked.time_spent": 3.4,
"data_stacked.size.$number": "0"
"data_stacked.other": {}
},
{
"_id": "some_other_id_value",
"time.$date": "2019-12-30"
"boolean_key": true,
"data_stacked.name": "task4",
"data_stacked.time.$date": "2019-12-30"
"data_stacked.time_spent": 4.1,
"data_stacked.size.$number": "0"
"data_stacked.other": {}
}]
I have found a similar case here: https: How to flatten up embedded JSON into multiple documents
But i didn't succeed to run this example, i have never used js before so i don't know what is missing.
On my last attempt, i tried to loop around keys:
tryjson = json.load(testjson)
def flatten_json2(y):
out = {}
def flatten(x, name=''):
if type(x) is dict:
for a in x:
flatten(x[a], name + a + '_')
elif type(x) is list:
i = 0
for a in x:
flatten(a, name + str(i) + '_')
i += 1
else:
out[name[:-1]] = x
flatten(y)
return out
import pprint
for i in tryjson:
test=(i['_id'] ,flatten_json2(i['data_stacked']))
print(test)
the result isn't good, _id key is missing because i looked for _id value only (i don't know how to return key and value for a specific field), data stacked seems to be sorted correctly but the whole object is a tuple so i can't use json_normalize either
Do you have a solution for this issue?
best regards.