I have a zip file in S3 that I'm trying to return to the user. When the user hits my API, the zip file should be downloaded to their computer.

Here's how I got the data from S3 (the data can also be retrieved by using s3.get_object)

@pyramid_view.view_config(route_name='route_name', renderer='string')
async def get_data(request):
    res = await func(request)
    return res


async def func(request):
    url = s3_client.generate_presigned_url('get_object',
                                           Params={'Bucket': DEFAULT_BUCKET_NAME,
                                                   'Key': path},
                                           ExpiresIn=3600)
    response = urllib.request.urlopen(url)
    finalData = response.read()
    return finalData

Once I get finalData, I set the headers as

request.response.content_type = 'binary/octet-stream'
request.response.headers['Access-Control-Allow-Methods'] = 'POST, GET'
request.response.headers['Access-Control-Expose-Headers'] = 'X-filename'
request.response.headers['Content-Transfer-Encoding']='binary'
request.response.headers['X-filename'] = path
request.response.content_disposition = 'attachment;filename=try.zip'

and then return finalData from my API.

The file gets downloaded; however, when I try opening it, it says:

[image: message shown when opening the downloaded file]

Does anyone have a clue why this might be happening? The same approach seems to work fine in a Flask app. I'm trying to make this work in an API gateway.

I opened the downloaded file in a hex editor, and it shows this:

[image: hex editor view of the downloaded file]

If I do responseFinal.decode(encoding="utf-8", errors="ignore"), the downloaded file looks like this (it still can't be unzipped):

[image: hex editor view of the file after decoding]

The file that was downloaded successfully via the presigned URL looks like this in a hex editor:

[image: hex editor view of the valid zip file]

Is there some encoding that I'm missing? (I have tried utf-8 decoding, but that fails for the zip file.)

  • What are the contents of the invalid zip file? That may give you a hint. Try opening it with a hex editor. – orlp Jul 18 '20 at 22:22
  • Hey @orlp, the zip file contains a .json file and a .csv file. If I download it via the presigned URL (just hit the URL in the browser), it downloads the file perfectly. I'm actually trying to develop an API that returns this data to the user. – Gauri Dhawan Jul 18 '20 at 22:51
  • You misunderstood my comment. Look at the `try (1).zip` through a hex editor to see what it actually contains to get a hint as to what's wrong. – orlp Jul 18 '20 at 23:47
  • @orlp updated the post with the contents of try(1).zip and what it is expected to look like. – Gauri Dhawan Jul 19 '20 at 18:36
  • That looks like a Python display serialized representation of a binary string, and not the actual binary data. Show the code you are using to write it to a file. The problem is probably there. It looks like you took the binary data and did `str(b"data")`, which is the wrong way to output it. – user120242 Jul 19 '20 at 18:58
  • Does this answer your question? [Python writing binary](https://stackoverflow.com/questions/20955543/python-writing-binary) – user120242 Jul 19 '20 at 19:02
  • @user120242 I am not writing the data to any file, I simply return `finalData` from my API. I've also updated the post to show how using `decode('utf-8', errors='ignore')` produces a result that is closer to the expected one. I also added the complete method to the post. – Gauri Dhawan Jul 19 '20 at 19:22
  • You need to return binary data and not string data. Whatever is doing the output is not letting you output binary data. Encoding it as a unicode string will still mangle the data. – user120242 Jul 19 '20 at 19:26
  • @user120242 do you think using await or pyramid could be causing this issue? Because when I do type(finalData) it says bytes – Gauri Dhawan Jul 19 '20 at 19:55
  • probably. renderer=string would definitely not work. https://docs.pylonsproject.org/projects/pyramid/en/latest/api/response.html#pyramid.response.FileResponse is probably what you want – user120242 Jul 19 '20 at 20:29

1 Answer

Figured it out. The issue was that the renderer in Pyramid was converting the data to a string. If we set the data on the response object and return that response, the renderer doesn't affect it. Thanks to @user120242 for guiding me in that direction.
I fixed it by creating a response object:

async def func(request):
    url = s3_client.generate_presigned_url('get_object',
                                           Params={'Bucket': DEFAULT_BUCKET_NAME,
                                                   'Key': path},
                                           ExpiresIn=3600)
    response = urllib.request.urlopen(url)
    finalData = response.read()
    request.response.content_type = 'binary/octet-stream'
    request.response.headers['Access-Control-Allow-Methods'] = 'POST, GET'
    request.response.headers['Access-Control-Expose-Headers'] = 'X-filename'
    request.response.headers['Content-Transfer-Encoding'] = 'binary'
    request.response.headers['X-filename'] = path
    request.response.content_disposition = 'attachment;filename=try.zip'

    request.response.body = finalData
    return request.response
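
For anyone who wants to see why the string conversion breaks the archive, here is a minimal standard-library sketch. It does not use Pyramid or S3. It assumes, as the comments suggest, that the string renderer effectively ends up calling str() on the bytes. The helper names (make_zip_bytes, is_valid_zip) and the archive contents are made up for illustration.

```python
import io
import zipfile


def make_zip_bytes() -> bytes:
    """Build a small in-memory zip archive (hypothetical contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("data.json", '{"ok": true}')
        archive.writestr("data.csv", "a,b\n1,2\n")
    return buffer.getvalue()


def is_valid_zip(payload: bytes) -> bool:
    """Return True if payload opens as a zip archive and passes its CRC checks."""
    try:
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False


raw = make_zip_bytes()

# Assumption: this mimics what a string-oriented renderer would send.
# str() on bytes yields the textual repr, e.g. "b'PK\\x03\\x04...'",
# so the client receives escape sequences instead of the real bytes.
stringified = str(raw).encode("utf-8")

print(is_valid_zip(raw))          # the untouched bytes open fine
print(is_valid_zip(stringified))  # the repr text is not a zip file
print(stringified[:4])            # starts with the literal characters b'PK
```

The downloaded file then starts with the literal characters b'PK rather than the PK signature, which is consistent with what the comments describe. Returning the bytes as the response body avoids that conversion.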