I need to read a really big JSONL file from a URL. The approach I am using is as follows:
    import json
    import urllib.request

    bulk_status_info = _get_bulk_info(shop)
    url = bulk_status_info.get('bulk_info').get('url')
    file = urllib.request.urlopen(url)
    for line in file:
        # each line is a bytes object holding one JSON document
        print(json.loads(line.decode("utf-8")))
However, my CPU and memory are limited, which brings me to two questions:
- Is the file loaded all at once, or is there some buffering mechanism that prevents memory from overflowing?
- If my task fails, I want to resume from the place where it failed. Is there some sort of cursor I can save? Note that things like seek or tell do not work here, since this is not an actual file on disk. (A rough sketch of what I have in mind is at the end of this post.)
Some additional info: I am using Python 3 and urllib.
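
For the second point, something along the following lines is what I have in mind: count the bytes consumed so far as a "cursor", and after a failure re-request the file from that offset with an HTTP Range header. This is only a sketch and it assumes the server honours Range requests (answering 206 Partial Content rather than 200); the stream_jsonl name and its parameters are just placeholders I made up.

    import json
    import urllib.request

    def stream_jsonl(url, start_offset=0):
        """Yield (byte_offset, record) pairs, resuming from start_offset if given."""
        # Hypothetical sketch: assumes the server supports HTTP Range requests.
        # If it ignores the header and replies 200, the offsets would be wrong.
        request = urllib.request.Request(url)
        if start_offset:
            request.add_header("Range", f"bytes={start_offset}-")
        offset = start_offset
        with urllib.request.urlopen(request) as response:
            for raw_line in response:              # read line by line, not all at once
                record = json.loads(raw_line.decode("utf-8"))
                offset += len(raw_line)            # position just after this line
                yield offset, record               # caller persists offset as the cursor

The idea would be to persist the latest offset (to a file, database, etc.) after each record is processed, and pass it back as start_offset on retry. Would this work, or is there a better built-in way to do it?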