In Python I'm trying to download every URL contained in a 180 MB JSON file. Even though it is only 180 MB, opening it in a text editor uses 5.9 GB of memory.
So Jupyter crashes when I try to read the JSON and extract the URLs inside.
Here is a sample from the JSON file:
{"company name": "ZERO CORP", "cik_number": "109284", "form_id": "10-K", "date": "19940629", "file_url": "https://www.sec.gov/Archives/data/109284/0000898430-94-000468.txt"}
{"company name": "FOREST LABORATORIES INC", "cik_number": "109563", "form_id": "10-K", "date": "19940628", "file_url": "https://www.sec.gov/Archives/data/38074/0000038074-94-000021.txt"}
{"company name": "GOULDS PUMPS INC", "cik_number": "14637", "form_id": "10-K", "date": "19940331", "file_url": "https://www.sec.gov/Archives/data/42791/0000042791-94-000002.txt"}
{"company name": "GENERAL HOST CORP", "cik_number": "275605", "form_id": "10-Q", "date": "19940701", "file_url": "https://www.sec.gov/Archives/data/40638/0000950124-94-001209.txt"}
Solutions that I think might work:
1) I think I'm going to need some kind of memory management so I can iterate over all the file_url values and download them in Python (a rough sketch of what I mean is below this list).
2) I could switch to JavaScript and use Node.js to do this iteration asynchronously, but I have never used JavaScript or Node.js before.
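For option 1, this is roughly what I'm picturing, assuming the requests library and the iter_file_urls generator from above; the output directory and the User-Agent string are placeholders (I believe the SEC wants a descriptive User-Agent with contact details).

import os
import requests

def download_all(urls, out_dir="filings"):
    os.makedirs(out_dir, exist_ok=True)
    headers = {"User-Agent": "my-name my-email@example.com"}  # placeholder contact info
    with requests.Session() as session:
        for url in urls:
            filename = os.path.join(out_dir, url.rsplit("/", 1)[-1])
            # Stream each response to disk so a large filing is not held in memory either.
            with session.get(url, headers=headers, stream=True, timeout=60) as resp:
                resp.raise_for_status()
                with open(filename, "wb") as f:
                    for chunk in resp.iter_content(chunk_size=8192):
                        f.write(chunk)

# download_all(iter_file_urls("metadata.json"))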