I want to create a CSV file out of a very complex dict. The real dict uses thousands of keys and more than 9 levels of depth, but this is just a mere example of the structure:
import pandas
my_stuff = [
{
"a":
[
{"1": "example1"},
{"2": [
{"2": "example2"},
{"3": "example3"}
]},
{"4": "example4"},
{"5": "example5"}
],
"b":
[
"example6", "61", "62"
]
}
]
result = pandas.json_normalize(my_stuff)
print(result.to_csv())
That prints:
,a,b 0,
"[{'1': 'example1'}, {'2': [{'2': 'example2'}, {'3': 'example3'}]}, {'4': 'example4'}, {'5': 'example5'}]","['example6', '61', '62']"
But I want this output:
"0.a.0.1, 0.a.0.2.2, 0.a.0.2.3, 0.a.0.4, 0.a.0.5, 0b.0"
"example1, example2, example3, example4, example5, example6;61;62"
I though pandas would be able to flatten the dict but seems like it can not. I need the keys to be used as headers like sectiona.subsection1.fieldwhatever
because that .csv will be later loaded into a database.
I hope anyone can help.
Bonus: I tried without pandas but got stuck here:
def flatten(py_structure, depth=""):
"""make a flatten dict"""
new_dict = {}
if isinstance(py_structure, dict):
for k, v in py_structure.items():
if isinstance(v, dict):
flattened_v = flatten(v, k)
elif isinstance(v, list):
flattened_v = flatten(v, k)
else:
flattened_v = v
new_dict[f"{depth}{k}"] = flattened_v
return new_dict
elif isinstance(py_structure, list):
for idx, v in enumerate(py_structure):
new_dict[f"{depth}{idx}"] = flatten(v, f"{depth}{idx}")
return new_dict