I have a MongoDB collection with documents like this one:
doc = {
    "_id": {"$oid": "516622c9ce21150200000d87"},
    "SubmissionDate": {"$date": "2013-04-11T02:41:13.162Z"},
    "isComplete": True,
    "Rounds": [
        {
            "Photo": [],
            "A": {
                "Complexity": 55,
                "Colour": 85,
                "Deep": 51,
                "Effervescence": 44
            },
            "B": {
                "QualityPIDs": [],
                "QualityScales": [],
                "Complexity": 43,
                "Qualities": []
            },
            "C": {
                "QualityPIDs": [],
                "QualityScales": [],
                "Complexity": 60,
                "UHS": 46,
                "Colour": 33,
                "Qualities": []
            },
            "D": {
                "Complexity": 73,
                "Duration": 68,
                "Quality": 65
            }
        }
    ],
    "Item": {
        "_id": {"$oid": "51e6d678c06918db21156f92"},
        "Country": "Australia",
        "Name": "King",
        "PeopleId": {"$oid": "51dddb69a9d9350200000"},
        "Style": "Apple",
        "Type": "Flat",
        "UserSubmitted": False
    }
}
I need to convert this collection into a pandas DataFrame.
The solution suggested here, How to import data from mongodb to pandas?, does the main job, but I am still left with a Rounds column containing a dict of dictionaries.
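For reference, `pd.json_normalize` flattens nested dicts such as Item into dotted columns, but a list-valued field like Rounds survives as a single column of lists (a minimal sketch on a trimmed copy of the document above):

```python
import pandas as pd

# Trimmed copy of the sample document above.
doc = {
    "isComplete": True,
    "Rounds": [{"A": {"Complexity": 55, "Colour": 85}}],
    "Item": {"Country": "Australia", "Name": "King"},
}

df = pd.json_normalize(doc)
# Nested dict keys are flattened ("Item.Country", "Item.Name"),
# but "Rounds" remains one column whose single value is the list itself.
print(df.columns.tolist())
```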
To access the subdictionaries of Rounds, I wrote a loop:
import pandas as pd

df = pd.json_normalize(doc)
A_data = pd.DataFrame(columns=df.Rounds[0][0]['A'].keys())
for i in range(len(df.Rounds)):
    # note: DataFrame.append is deprecated in recent pandas versions
    A_data = A_data.append(pd.json_normalize(df.Rounds[i][0]['A']), ignore_index=True)
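For illustration, this is what the loop extracts, run on a made-up two-document stand-in for `df.Rounds` (the `pd.concat` at the end only combines the rows for display):

```python
import pandas as pd

# Made-up stand-in for df.Rounds: one list of round dicts per document.
rounds = [
    [{"A": {"Complexity": 55, "Colour": 85, "Deep": 51, "Effervescence": 44}}],
    [{"A": {"Complexity": 12, "Colour": 30, "Deep": 7, "Effervescence": 9}}],
]

# One flattened row per document, taken from the "A" sub-dictionary.
a_rows = [pd.json_normalize(r[0]["A"]) for r in rounds]
A_data = pd.concat(a_rows, ignore_index=True)
print(A_data)
```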
And finally I concat A_data onto my main DataFrame.
Is there a faster way to do this? Right now the loop takes too much time. Thank you!