I am trying to read only part of a response body. The server does not honor the Range header, so I cannot request a byte range directly.
I have tried the following:
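    import requests

    def get_data(change_id):
        url = "https://my-api.com?id={}".format(change_id)
        # stream=True defers the body download until it is iterated
        # (headers is defined elsewhere)
        r = requests.get(url, stream=True, headers=headers)
        for data in r.iter_content(chunk_size=512):
            # Return after inspecting only the first chunk
            return extract_change_id(data)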
As far as I can tell, this still downloads the entire response.
Each response contains a new change id, which gives access to the next request, so the API behaves like a river. The idea is to read only the first few bytes of the body, extract the change id (it always appears first in the body), and then pass it to a new thread for processing. Each response is upwards of 5 MB, and responses need to be handled concurrently to keep up with the river. Reading the whole response and then parsing it as JSON is too slow.
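A minimal sketch of the "read only the first few bytes" idea, assuming the body starts with a JSON field named "change_id" (the field name, CHANGE_ID_RE, find_change_id, and max_bytes are all hypothetical and not taken from the real API). It buffers chunks until the pattern matches, because a single chunk may end in the middle of the id, and it stops pulling data as soon as a match is found:

```python
import re

# Hypothetical pattern: assumes the body begins with something like
# {"change_id":"123-456", ...}. Adjust it to the real payload.
CHANGE_ID_RE = re.compile(rb'"change_id"\s*:\s*"([^"]+)"')


def find_change_id(chunks, pattern=CHANGE_ID_RE, max_bytes=4096):
    """Scan an iterable of byte chunks and return the first match as str.

    Stops pulling chunks once the pattern matches or once max_bytes
    have been buffered, so the rest of the stream is not read here.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        match = pattern.search(buffer)
        if match:
            return match.group(1).decode("ascii")
        if len(buffer) >= max_bytes:
            break  # give up instead of buffering the whole body
    return None
```

With requests, the chunks could come from r.iter_content(chunk_size=512) on a response opened with stream=True. Calling r.close() once the id is found (or using the response as a context manager) should drop the connection rather than leave the remaining body pending.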
The extract function is a simple two-step regex search:
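    def extract_change_id(data):
        try:
            # First locate the full change id field, then pull the id out of it
            full_change_id = full_change_id_regex.search(data).group(0)
            change_id = change_id_regex.search(full_change_id).group(0)
            return change_id
        except AttributeError:
            # search() returns None when nothing matches, so .group() raises
            return None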
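For reference, a self-contained sketch of such a two-step search. The two patterns below are hypothetical stand-ins, since the real ones are not shown here. They are written as bytes patterns because iter_content() yields bytes on Python 3, where applying a str pattern to a bytes object raises a TypeError:

```python
import re

# Hypothetical patterns; the real ones may differ.
# Step 1 matches the whole field, step 2 matches the id inside it.
full_change_id_regex = re.compile(rb'"change_id"\s*:\s*"[^"]*"')
change_id_regex = re.compile(rb"\d+(?:-\d+)*")


def extract_change_id(data):
    """Return the change id found in data (as bytes), or None."""
    full_match = full_change_id_regex.search(data)
    if full_match is None:
        return None
    id_match = change_id_regex.search(full_match.group(0))
    return id_match.group(0) if id_match else None
```

Checking each match explicitly, rather than catching every exception, keeps an unrelated error such as a str/bytes mismatch from being silently turned into None.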