
I am trying to convert the Yelp challenge dataset from JSON to CSV format using pandas. My session crashes with a memory error ("Not enough memory"), even though I am using a Google Colab high-RAM runtime. My code works for the other files, but fails on yelp_academic_dataset_review.json. Below is my code sample. Can anyone suggest a solution? Thanks

import pandas as pd
df = pd.read_json('/content/drive/MyDrive/Data/yelp_academic_dataset_review.json', lines=True)

df.to_csv('/content/drive/MyDrive/Data/yelp_review.csv', index=False)
  • If the incoming file is one json object per line a [solution like in the top answer here](https://stackoverflow.com/questions/42169287/python-performance-tuning-json-to-csv-big-file) would be appropriate. – JNevill Jul 07 '21 at 14:05
  • Does this answer your question? [How do I read a large csv file with pandas?](https://stackoverflow.com/questions/25962114/how-do-i-read-a-large-csv-file-with-pandas) The solution to use `chunksize` should apply to JSON as well. – fsimonjetz Jul 07 '21 at 14:12
  • Sorry I didn't get that. It's one file named yelp_academic_dataset_review.json @JNevill – Emrul Jul 07 '21 at 14:15
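
Building on the `chunksize` suggestion in the comments, here is a minimal sketch of streaming a line-delimited JSON file to CSV in chunks so the full dataset never has to fit in memory at once. The file names and the tiny demo input are placeholders; swap in the real Drive paths and drop the demo-writing step for the actual Yelp file.

```python
import json
import pandas as pd

# Demo input: a small line-delimited (NDJSON) file standing in for
# yelp_academic_dataset_review.json; replace src/dst with your Drive paths.
src, dst = 'reviews.json', 'reviews.csv'
with open(src, 'w') as f:
    for i in range(10):
        f.write(json.dumps({'review_id': i, 'stars': 1 + i % 5, 'text': 'ok'}) + '\n')

# chunksize makes read_json return an iterator of DataFrames instead of
# loading the whole file; each chunk is converted and flushed to disk.
with pd.read_json(src, lines=True, chunksize=4) as reader:
    for i, chunk in enumerate(reader):
        # First chunk creates the file with a header; later chunks append.
        chunk.to_csv(dst, index=False, mode='w' if i == 0 else 'a', header=(i == 0))
```

For the real review file, a chunk size in the range of 100,000 lines is a reasonable starting point; tune it to the available RAM.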

0 Answers