I have a huge list of JSON objects (3.23 million of them). I want to normalize this list and convert it to a dataframe; after normalizing I end up with 400 fields. I am able to do these steps (normalize, then build the dataframe) for a few thousand JSON objects, but not for the entire list.
Here is how I built the list: I go through all the .json files in the folder and append every JSON object to an empty list, `data_full`:
```python
import os
import json

data_full = []
path = "a/b/c"
for file in os.listdir(path):
    full_path = os.path.join(path, file)
    with open(full_path) as f:
        # each line of the file is a separate JSON record
        for line in f:
            data_full.append(json.loads(line))
```
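For reference, the normalize/dataframe step that works for a few thousand records is roughly this (using `pandas.json_normalize` on a slice of the list; the slice size is just an example):

```python
import pandas as pd

# Works fine for a few thousand records and gives ~400 columns after
# flattening, but not for the full 3.23 million.
df_small = pd.json_normalize(data_full[:10000])
```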
Given the size of the list, I want to divide it into 35 equal parts and create a new dataframe for each part (df_1, df_2, ... df_35). After a lot of searching I could find how to split a huge list into a single list of lists (chunks), and how to convert a list to a dataframe, but I could not find a way to split a huge list into multiple new lists and *convert each of those lists into a new dataframe*. The last bit is in italics because I think that once I have the 35 new lists, converting each of them to a dataframe should be easy.
So the question is: how do I split this huge list into 35 new lists? If you have any other approach or suggestions for processing 3.23 million JSON objects in order to apply some NLP techniques, I would appreciate that too.
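To make the question concrete, this is the kind of splitting I have in mind (a rough sketch I have not been able to test at full scale; `split_into_parts` is just a helper name, and I keep the dataframes in a dict rather than in 35 separate variables):

```python
import pandas as pd

def split_into_parts(lst, n_parts):
    """Split lst into n_parts roughly equal slices."""
    k, r = divmod(len(lst), n_parts)
    parts = []
    start = 0
    for i in range(n_parts):
        end = start + k + (1 if i < r else 0)  # spread the remainder over the first r parts
        parts.append(lst[start:end])
        start = end
    return parts

parts = split_into_parts(data_full, 35)

# one dataframe per chunk, keyed as df_1 ... df_35
dfs = {f"df_{i}": pd.json_normalize(part) for i, part in enumerate(parts, start=1)}
```

Whether this is sensible memory-wise for 3.23 million records is exactly what I am unsure about.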
Thanks in advance