I have read the answers in this question and the next one. I have 2 huge JSON files: the first is players_data.JSON (8GB) and the second is match_data.JSON (80GB). The structure of the data is complex: the json package returns a dictionary whose values can be lists or dictionaries, and some of those values are themselves dictionaries of dictionaries or lists of dictionaries. So the structure contains multiple levels of nested dictionaries.
My questions are as follows:
- What is the best Python package for parsing JSON files of this size?
- What is the best data structure for processing the data? I will need to compute statistics for each player and each match (6 players). For instance, a dictionary with composite keys (playerId, matchId) could be an option.
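
To make the second question concrete, here is a minimal sketch of what I mean by a dictionary with (playerId, matchId) keys. The records and the "kills" field are made up for illustration (the real files are far more deeply nested), and the sketch assumes the records can be iterated one at a time rather than loaded all at once.

```python
from collections import defaultdict

# Hypothetical, flattened records; field names other than
# playerId and matchId are assumptions for illustration only.
records = [
    {"matchId": 1, "playerId": "a", "kills": 3},
    {"matchId": 1, "playerId": "b", "kills": 5},
    {"matchId": 2, "playerId": "a", "kills": 7},
]

# Statistics keyed by the composite (playerId, matchId) tuple.
stats = defaultdict(lambda: {"kills": 0, "rows": 0})

for rec in records:
    key = (rec["playerId"], rec["matchId"])
    stats[key]["kills"] += rec["kills"]
    stats[key]["rows"] += 1

# Per-player totals require a second pass over all composite keys.
kills_per_player = defaultdict(int)
for (player_id, _match_id), s in stats.items():
    kills_per_player[player_id] += s["kills"]
```

With this layout, looking up one player in one match is a single dictionary access, but any per-player or per-match statistic needs a scan over every key, which is part of why I am unsure it is the right structure at this scale.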