
I have a zip file on S3. I want to lazy-load dependencies in a Lambda function by downloading the zip file to the /tmp folder and then unzipping its contents there. I am using Python for these operations. Below is the code:

import sys
import zipfile
import subprocess

s3 = boto3.resource('s3')
s3.meta.client.download_file('bucket', 'dependencies/dependencies.zip', '/tmp/dependencies.zip')

output = subprocess.check_output('unzip /tmp/dependencies.zip -d /tmp/')
print("Output: ", output.decode('utf-8'))

I am getting the error: No such file or directory: 'unzip /tmp/dependencies.zip -d /tmp/': 'unzip /tmp/dependencies.zip -d /tmp/' Unknown application error occurred

I have placed my code outside lambda_handler() so that it runs on cold start. The download line didn't throw any error, but I am not even sure that the zip file has been copied to the /tmp directory, and I am not able to unzip the contents. I tried removing the / in front of tmp, and that didn't help either. Where am I going wrong?

2 Answers


First thing is to confirm that the file has been downloaded (How to Check if a File Exists in Python – dbader.org).
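A minimal sketch of such a check, using only the standard library. The helper name check_download is hypothetical, not something from the question:

```python
import os
import zipfile


def check_download(path):
    """Return (exists, size_in_bytes, looks_like_zip) for path."""
    exists = os.path.isfile(path)
    size = os.path.getsize(path) if exists else 0
    # is_zipfile only inspects the file's structure; it does not extract anything.
    return exists, size, exists and zipfile.is_zipfile(path)
```

Calling check_download('/tmp/dependencies.zip') right after the download and printing the result should show whether the file arrived and is a readable archive.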

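The error suggests that the sub-process can't find the program to run. When subprocess.check_output is given a single string without shell=True, it treats the whole string as the name of the executable, so it looks for a program literally called 'unzip /tmp/dependencies.zip -d /tmp/'. Even with the arguments split correctly, the unzip binary might not be available in the Lambda environment.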
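If you still want to call the command-line tool, a sketch along these lines passes the command as an argument list and checks for the binary first. The function name unzip_with_cli is hypothetical, and whether unzip exists in your runtime is something you would have to verify:

```python
import shutil
import subprocess


def unzip_with_cli(zip_path, dest_dir):
    """Extract zip_path into dest_dir with the unzip CLI, if it is installed."""
    unzip_bin = shutil.which("unzip")
    if unzip_bin is None:
        raise FileNotFoundError("unzip binary not found on PATH")
    # Each argument is its own list element, so no shell is involved.
    # -o overwrites existing files without prompting.
    output = subprocess.check_output([unzip_bin, "-o", zip_path, "-d", dest_dir])
    return output.decode("utf-8")
```

If shutil.which returns None, the binary is not on the PATH and a pure-Python approach is the safer choice.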

I notice you have imported zipfile, so you could use that library to unzip. From Unzipping files in python:

import zipfile
with zipfile.ZipFile('/tmp/dependencies.zip', 'r') as zip_ref:
    zip_ref.extractall('/tmp/')
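
Since the goal is lazy-loading dependencies, the extraction could be combined with making the target folder importable. This is only a sketch: the helper name extract_and_register is hypothetical, and it assumes the archive holds Python packages at its top level, which the question does not confirm:

```python
import os
import sys
import zipfile


def extract_and_register(zip_path, target_dir):
    """Extract zip_path into target_dir and put target_dir on sys.path."""
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(target_dir)
    # Assumption: the extracted files are importable modules or packages.
    if target_dir not in sys.path:
        sys.path.insert(0, target_dir)
    return sorted(os.listdir(target_dir))
```

Called at module level after the download, for example extract_and_register('/tmp/dependencies.zip', '/tmp/dependencies'), it would run once per cold start, and the returned listing can be printed to confirm what was extracted.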
John Rotenstein

The /tmp directory has limited storage (512 MB by default), so I would recommend avoiding that method. Instead, you can read the zipped file into an in-memory buffer and list the file names it contains using the zipfile library. Then you read the content of each file within the archive and re-upload it to S3.

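import io
import logging
import zipfile

import boto3

logger = logging.getLogger(__name__)
s3_resource = boto3.resource('s3')
# sourcebucketname, filekey and destinationbucket (an S3 Bucket resource)
# are assumed to be defined elsewhere.

zipped_file = s3_resource.Object(bucket_name=sourcebucketname, key=filekey)
buffer = io.BytesIO(zipped_file.get()["Body"].read())
zipped = zipfile.ZipFile(buffer)

for file in zipped.namelist():
    logger.info(f'current file in zipfile: {file}')
    # '.extension' is a placeholder suffix for the destination key.
    final_file_path = file + '.extension'

    with zipped.open(file, "r") as f_in:
        content = f_in.read()
        destinationbucket.upload_fileobj(io.BytesIO(content),
                                         final_file_path,
                                         ExtraArgs={"ContentType": "text/plain"})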
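
One detail worth handling is that namelist() also returns directory entries, which have no content to upload. A small sketch of the in-memory iteration that skips them, testable without S3 (the generator name iter_zip_members is hypothetical):

```python
import io
import zipfile


def iter_zip_members(zip_bytes):
    """Yield (name, content) for each regular file in an in-memory zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zipped:
        for info in zipped.infolist():
            if info.is_dir():
                continue  # directory entries carry no file content
            yield info.filename, zipped.read(info)
```

Each yielded pair could then be wrapped in io.BytesIO and passed to upload_fileobj as in the loop above.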

There's also a tutorial here: https://betterprogramming.pub/unzip-and-gzip-incoming-s3-files-with-aws-lambda-f7bccf0099c9

x89