I’m trying to iterate through query results in nested ordered dictionaries returned from the Salesforce REST API and convert them into a pandas DataFrame object. I'm having trouble with this due to the inconsistent nesting of the dictionaries being returned.
For example, I have a list of OrderedDict objects, each of which contains an ‘Id’, ‘Email’, and another ‘attributes’ OrderedDict object consisting of ‘url’, and ‘type’:
[OrderedDict([(u'attributes', OrderedDict([(u'type', u'someType'), (u'url', u'/someurl')])), (u'Id', u'1A'), (u'Email', u'xxxx@xxxx.com')]),…]
What I would like to convert this to is a list of OrderedDict objects where all the data fields are on the same level (i.e., no nested OrderedDict objects) and the data contained within the ‘attributes’ OrderedDict has been brought up one level :
[OrderedDict([(u'type', u'someType'), (u'url', u'/someurl'), (u'Id', u'1A'), (u'Email', u'xxxx@xxxx.com')]),…]
Ideally, I'd like to be able to format the data into a long format such as this for any number of levels of nesting, as it would help me parse the data into a more easily workable format (i.e., pandas DataFrames). I come from an R background, so I am much more comfortable manipulating DataFrames than nested dictionaries.
I've been playing around with a number of posted solutions to similar problems, but each seem to address a slightly different use case than my own. The issue is that there is no consistency in the number of levels of nested dictionaries in my returned data. If I simply try to convert my current list into a pandas DataFrame, I get the following:
import pandas as pd
pd.DataFrame(salesRecords).head()
Email Id
0 some@email.com 1A
1 some@email.com 1B
2 some@email.com 1C
3 some@email.com 1D
4 some@email.com 1E
attributes
0 {u'type': u'someType', u'url': u'someurl...
1 {u'type': u'someType', u'url': u'someurl...
2 {u'type': u'someType', u'url': u'someurl...
3 {u'type': u'someType', u'url': u'someurl...
4 {u'type': u'someType', u'url': u'someurl...
In this example, I would like to 'flatten' the attributes column into two columns containing the attributes data ('type' and url'). If anybody would be willing to point me in the right direction, it would be greatly appreciated.
Thanks a lot!