I am downloading large .zip files from a REST API endpoint (a private corporate URL) using requests in Python. Every downloaded file ends up being only 1 KB and is therefore invalid.
- I tried the urllib library as well, and it produced the same invalid 1 KB files.
- I tried the clint library too, but that didn't work either.
- The endpoint downloads the file when I open the URL in a browser, but not when I request it from Python code.
# Attempt 1: requests, streaming the body in 4 KB chunks
with open(dir+"/"+dict['title'], 'wb') as dumpstate_file:
    print("downloading")
    for chunk in response.iter_content(chunk_size=4096):
        if chunk:
            dumpstate_file.write(chunk)

# Attempt 2: requests with a clint progress bar
with open(dir+"/"+dict['title'], 'wb') as out:
    total_length = int(r.headers.get('content-length'))
    for ch in progress.bar(r.iter_content(chunk_size = 2391975), expected_size=(total_length/1024) + 1):
        if ch:
            out.write(ch)

# Attempt 3: urllib3 (http is presumably a urllib3.PoolManager)
r = http.request('GET', url, preload_content=False)
print("downloading")
with open(dir+"/"+dict['title'], 'wb') as out:
    while True:
        data = r.read(8192)
        if not data:
            break
        out.write(data)
No error messages are raised. The only symptom is that each downloaded file is just 1 KB and is not a valid zip.
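One way to narrow this down is to look at what the 1 KB file actually contains. When a browser gets the full archive but a script gets a tiny file, the body is often an HTML error or login page rather than zip data, though that is only a guess for this endpoint. A minimal sketch using only the standard library; `inspect_download` is a hypothetical helper name, not part of the original script:

```python
import zipfile
from pathlib import Path


def inspect_download(path, preview_bytes=200):
    """Report whether a downloaded file is a real zip archive.

    Hypothetical debugging helper: returns the file size, whether the
    zipfile module recognises it, and the first bytes decoded as text.
    """
    path = Path(path)
    raw = path.read_bytes()[:preview_bytes]
    return {
        "size": path.stat().st_size,
        "is_zip": zipfile.is_zipfile(str(path)),
        # An error or login page is usually readable text; zip data starts with b"PK".
        "preview": raw.decode("utf-8", errors="replace"),
    }
```

Printing the `preview` value for one of the 1 KB files should show whether the server returned a message instead of the archive.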
UPDATE
I found a solution to this issue. I ran the equivalent curl request from the Anaconda Prompt, and the zip file downloaded successfully.
This is the curl request I ran:
curl --data-urlencode "singleId=alok3.singh" --data-urlencode "appId=SRI_N_TOOL" --data-urlencode "userLang=KO" --data-urlencode "serviceCode=GET_FILE" --data-urlencode "divisionCode=00" --data-urlencode "param={'divisionCode':'25','docId':'00DFOJ57StPMWL1000','title':'CallerId[1].zip','fileId':'00DFOJ57LtPMWL1000'}" -o CallerId[1].zip http://{private company url}/fileapi/getFile.do
I am not sure how to use this command from my Python script, which I need to do in order to automate these downloads.
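The curl command sends its `--data-urlencode` fields as a form-encoded POST body, so the same request can probably be made from Python without curl. The following is a sketch under assumptions, not a confirmed fix: `URL` is a placeholder because the real host is private, `build_request` and `download` are hypothetical helper names, and it is unknown whether the server also expects cookies or extra headers.

```python
import shutil
import urllib.parse
import urllib.request

# Placeholder: the real host is private and not given in the post.
URL = "http://example.invalid/fileapi/getFile.do"


def build_request(url, doc_id, title, file_id):
    """Build a POST request carrying the same form fields as the curl command."""
    # Mirrors the single-quoted param value used in the curl command.
    param = "{{'divisionCode':'25','docId':'{0}','title':'{1}','fileId':'{2}'}}".format(
        doc_id, title, file_id
    )
    fields = {
        "singleId": "alok3.singh",
        "appId": "SRI_N_TOOL",
        "userLang": "KO",
        "serviceCode": "GET_FILE",
        "divisionCode": "00",
        "param": param,
    }
    body = urllib.parse.urlencode(fields).encode("ascii")
    # Supplying data makes urllib send a POST, as curl does with --data-urlencode.
    return urllib.request.Request(url, data=body)


def download(request, dest_path):
    """Stream the response body to dest_path without holding it all in memory."""
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

With the real URL in place, usage would look like `download(build_request(URL, docId, title, fileId), title)`. The earlier attempts in this post issued GET requests, which may be why they received a different response than curl did.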
FURTHER UPDATE
I used Python's subprocess module to run the curl request from the update above.
Here is the subprocess call. This code works and downloads the complete file:
subprocess.call(["curl", "--data-urlencode", "singleId=alok3.singh", "--data-urlencode", "appId=SRI_N_TOOL", "--data-urlencode", "userLang=KO", "--data-urlencode", "serviceCode=GET_FILE", "--data-urlencode", "divisionCode=00", "--data-urlencode", "param={'divisionCode':'25','docId':'00DF','title':'hello.zip','fileId':'tPMWL'}","-o","hello.zip","https://privateurl"])
But there is a catch: when I replace the docId value with a string variable that contains the docId, the download doesn't work.
docId='00DF'
subprocess.call(["curl", "--data-urlencode", "singleId=alok3.singh", "--data-urlencode", "appId=SRI_N_TOOL", "--data-urlencode", "userLang=KO", "--data-urlencode", "serviceCode=GET_FILE", "--data-urlencode", "divisionCode=00", "--data-urlencode", "param={'divisionCode':'25','docId':docId,'title':'hello.zip','fileId':'tPMWL'}","-o","hello.zip","https://privateurl"])
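A small illustration of the likely cause, not taken from the original script: in the call above, the name docId appears inside the quoted string, so Python treats it as plain text and never looks up the variable. The variable names `literal` and `substituted` are made up for this sketch.

```python
docId = "00DF"

# The name docId sits inside the quotes, so it is just five characters of text.
literal = "param={'divisionCode':'25','docId':docId}"

# Building the string with format() inserts the variable's value instead.
substituted = "param={{'divisionCode':'25','docId':'{0}'}}".format(docId)

print(literal)
print(substituted)
```

The doubled braces in the format string are how `str.format()` writes a literal `{` or `}`.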
SOLUTION
What was needed was to build param from a dictionary, so that the variables are substituted into it, using json.dumps() and json.loads():
import json
import subprocess

# Build param from a dict so docId and title are inserted as values.
param = json.dumps({
    'divisionCode':'25',
    'docId':docId,
    'title':title,
    'fileId':title
})
# json.loads() turns the JSON text back into a dict, which format() renders as text.
args = ["curl", "--data-urlencode", "singleId=alok3.singh", "--data-urlencode", "appId=SRI_N_TOOL", "--data-urlencode", "userLang=KO", "--data-urlencode", "serviceCode=GET_FILE", "--data-urlencode", "divisionCode=00", "--data-urlencode", "param={0}".format(json.loads(param)),"-o","CallerId[1].zip","http://private url"]
subprocess.call(args)
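For automating many downloads, the argument list can be wrapped in a function. This is only a sketch: `build_curl_args` and its parameters are hypothetical names. Because `json.loads(json.dumps(d))` gives back an equal dict for string keys and values, the sketch formats the dict directly, which yields the same single-quoted text. The original passes title for fileId; here fileId is its own parameter in case the two differ.

```python
def build_curl_args(doc_id, title, file_id, out_path, url):
    """Return the curl argument list with the variables substituted into param."""
    param = {
        "divisionCode": "25",
        "docId": doc_id,
        "title": title,
        "fileId": file_id,
    }
    form_fields = [
        "singleId=alok3.singh",
        "appId=SRI_N_TOOL",
        "userLang=KO",
        "serviceCode=GET_FILE",
        "divisionCode=00",
        # Formatting a dict gives its single-quoted text form, the same text
        # produced by formatting json.loads(json.dumps(param)).
        "param={0}".format(param),
    ]
    args = ["curl"]
    for field in form_fields:
        args += ["--data-urlencode", field]
    return args + ["-o", out_path, url]
```

Each download then becomes `subprocess.call(build_curl_args(docId, title, fileId, title, url))`, with `url` set to the private endpoint.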