I am developing a Python Lambda function.

The documentation suggests that we can download files like this:

s3.download_file('BUCKET_NAME', 'OBJECT_NAME', 'FILE_NAME')

I have a bucket and a zip file inside the bucket. So what do I put as the object name when there's no folder?

I tried these:

s3.download_file('testunzipping','DataPump_10000838.zip','DataPump_10000838.zip')

s3.download_file('testunzipping','DataPump_10000838.zip')

But I get a time-out error in both cases.

  "errorMessage": "2021-10-17T14:51:34.889Z 4257cbc1-2dd0-4fb9-b147-0dffce1f97a1 Task timed out after 3.06 seconds"

However, this works just fine:

lst = s3.list_objects(Bucket='testunzipping')['Contents']

There also don't seem to be any permission issues, as the Lambda's execution role has a policy giving it the s3:GetObject permission:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ExampleStmt",
            "Action": [
                "s3:GetObject"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::testunzipping"
            ]
        }
    ]
}

The role also has S3FullAccess.

What is the issue?


1 Answer

Your task is timing out because the default Lambda execution timeout is 3 seconds and the download_file call is taking longer than that.

Go into the function's general configuration settings and increase the timeout to 10 seconds, which should be plenty of time to download a 17 KB file.


With that fixed, you still won't be able to download the file, as you'll get an [Errno 13] Permission denied error.

In Lambda functions, you must download the file to the /tmp directory, as that is the only part of the file system that AWS permits you to write to (and read from).

s3.download_file('testunzipping','DataPump_10000838.zip','/tmp/DataPump_10000838.zip')

The /tmp directory also has a fixed size of 512 MB, so keep that in mind if downloading larger objects.
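Putting the two fixes together, the download step can be sketched as below. The helper name and the client-injection style are illustrative, not from the answer; the bucket and key in the usage comment are the ones from the question.

```python
def download_to_tmp(s3_client, bucket, key):
    # /tmp is the only writable path in the Lambda file system
    local_path = f"/tmp/{key}"
    s3_client.download_file(bucket, key, local_path)
    return local_path

# In the Lambda handler you would call, for example:
# import boto3
# path = download_to_tmp(boto3.client("s3"),
#                        "testunzipping", "DataPump_10000838.zip")
```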

  • You can directly stream an object into a Python stream as well, by using `s3.get_object(Bucket=bucket, Key=key)["Body"]` in place of any other stream you would normally use in Python. – lynkfox Oct 17 '21 at 17:24
  • Yes but `download_file` is what is mentioned in the question :) – Ermiya Eskandary Oct 17 '21 at 17:30
  • @lynkfox my final aim is to unzip the file. Could you explain how your alternative would be different? Maybe that's more suitable – x89 Oct 17 '21 at 18:34
  • @x89 https://stackoverflow.com/a/3451150/4800344 – Ermiya Eskandary Oct 17 '21 at 18:35
  • 1
    Yes, I already checked that! :) – x89 Oct 17 '21 at 18:36
  • The zip libraries in Python work from a stream. When you use `load` to load a file into your Python script/module, you are starting a stream of data that is then run through the unzip. If you originate that stream with `get_object` instead, you can usually put it through the zip libraries as well. You can treat `s3.get_object` as an AWS replacement for loading a local file. – lynkfox Oct 17 '21 at 19:50
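The streaming approach in the comments above can be sketched as follows. The helper name is illustrative, not from the thread; the body is read into an in-memory `BytesIO` buffer because `zipfile.ZipFile` needs a seekable file object, which the raw streamed body is not.

```python
import io
import zipfile

def unzip_stream(stream):
    # zipfile needs a seekable file object, so buffer the stream in memory
    buffer = io.BytesIO(stream.read())
    with zipfile.ZipFile(buffer) as zf:
        return zf.namelist()

# With S3, the stream would come from get_object, e.g.:
# obj = s3.get_object(Bucket="testunzipping", Key="DataPump_10000838.zip")
# names = unzip_stream(obj["Body"])
```

Note that this buffers the whole archive in memory, so it is subject to the Lambda memory limit rather than the /tmp size limit.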