I am trying to read from a streaming API where the data is sent using Chunked Transfer Encoding. There can be more than one record per chunk, each record is separated by a CRLF. And the data is always sent using gzip compression. I am trying to get the feed and then do some processing at a time. I have gone through a bunch of stackOverflow resources but couldn't find a way to do it in Python. the iter_content(chunk) size in my case is throwing an exception on the line.
for chunk in api_response.iter_content(chunk_size=1024):
In Fiddler (which I am using as a Proxy) I can see that data is being constant downloaded and doing a "COMETPeek" in Fiddler, I can actually see some sample json.
Even iter_lines does not work. I have looked at asyncio and aiohttp case mentioned here: Why doesn't requests.get() return? What is the default timeout that requests.get() uses?
but not sure how to do the processing. As you can see I have tried using bunch of python libraries. Sorry some of the code might have some libraries that I later removed from usage as it didn't work out.
I have also looked at the documentation for requests library but couldn't find anything substantial.
As mentioned above, below is a sample code of what I am trying to do. Any pointers to how I should proceed would be highly appreciated.
This is the first time I am trying to read a stream
from oauthlib.oauth2 import BackendApplicationClient
from requests_oauthlib import OAuth2Session
import requests
import zlib
import json
READ_BLOCK_SIZE = 1024*8
clientID="ClientID"
clientSecret="ClientSecret"
proxies = {
"https": "http://127.0.0.1:8888",
}
client = BackendApplicationClient(client_id=clientID)
oauth = OAuth2Session(client=client)
token = oauth.fetch_token(token_url='https://baseTokenURL/token', client_id=clientID,client_secret=clientSecret,proxies=proxies,verify=False)
auth_t=token['access_token']
#auth_t = accesstoken.encode("ascii", "ignore")
headers = {
'authorization': "Bearer " + auth_t,
'content-type': "application/json",
'Accept-Encoding': "gzip",
}
dec=zlib.decompressobj(32 + zlib.MAX_WBITS)
try:
init_res = requests.get('https://BaseStreamURL/api/1/stream/specificStream', headers=headers, allow_redirects=False,proxies=proxies,verify=False)
if init_res.status_code == 302:
print(init_res.headers['Location'])
api_response = requests.get(init_res.headers['Location'], headers=headers, allow_redirects=False,proxies=proxies,verify=False, timeout=20, stream=True,params={"smoothing":"1", "smoothingBucketSize" : "180"})
if api_response.status_code == 200:
#api_response.raw.decode_content = True
#print(api_response.raw.read(20))
for chunk in api_response.iter_content(chunk_size=api_response.chunk_size):
#Parse the response
elif init_res.status_code == 200:
print(init_res.content)
except Exception as ce:
print(ce)
UPDATE I am looking at this now: https://aiohttp.readthedocs.io/en/v0.20.0/client.html
Would that be the way to go?