I am trying to load the json file to pandas data frame. I found that there were some nested json. Below is the sample json:
{'events': [{'id': 142896214,
'playerId': 37831,
'teamId': 3157,
'matchId': 2214569,
'matchPeriod': '1H',
'eventSec': 0.8935539999999946,
'eventId': 8,
'eventName': 'Pass',
'subEventId': 85,
'subEventName': 'Simple pass',
'positions': [{'x': 51, 'y': 49}, {'x': 40, 'y': 53}],
'tags': [{'id': 1801, 'tag': {'label': 'accurate'}}]}
I used the following code to load json into dataframe:
with open('EVENTS.json') as f:
jsonstr = json.load(f)
df = pd.io.json.json_normalize(jsonstr['events'])
Below is the output of df.head()
But I found two nested columns such as positions and tags.
I tried using the following code to flatten it:
Position_data = json_normalize(data =jsonstr['events'], record_path='positions', meta = ['x','y','x','y'] )
It showed me an error as follow:
KeyError: "Try running with errors='ignore' as key 'x' is not always present"
Can you advise me how to flatten positions and tags ( those having nested data).
Thanks, Zep