
I am trying to use a Lambda function behind an API Gateway to get a file from a folder within an S3 bucket. I want to do this to import the latest version of a CSV file into PowerBI/Tableau for data analysis. This works if I hard-code the filename; however, that obviously doesn't give me the latest file. I want the code to always take the latest file in that folder. I'm planning to do this via the Last Modified attribute, not via the filename itself.

My bucket looks like this:

BUCKET
  - Input
    - input file.csv
  - Output
    - Output18042019.csv
    - Output19042019.csv

The code I have, however, doesn't allow for searching within folders (as far as I'm aware), nor does it seem to pick the latest file. I have tried putting a file in the root of the bucket to see if it works, but it doesn't. How can I solve this?

import json
import boto3
from datetime import datetime


def lambda_handler(event, context):
    # TODO implement
    get_last_modified = lambda obj: int(obj['LastModified'].strftime('%s'))

    client = boto3.client('s3')
    objs= client.list_objects_v2(Bucket='BUCKET')['Contents']
    last_added = [obj['Key'] for obj in sorted(objs, key=get_last_modified)][0]

    bucket='BUCKET'
    link = client.generate_presigned_url('get_object', {'Bucket': bucket, 'Key': last_added}, 7200, 'GET')
    return {
        "statusCode": 303,
        "headers": {'Location': link}
    }

The error I'm getting is the following:

Response:
{
  "stackTrace": [
    [
      "/var/task/lambda_function.py",
      11,
      "lambda_handler",
      "objs= client.list_objects_v2(Bucket='BUCKET')['Contents']"
    ],
    [
      "/var/runtime/botocore/client.py",
      314,
      "_api_call",
      "return self._make_api_call(operation_name, kwargs)"
    ],
    [
      "/var/runtime/botocore/client.py",
      612,
      "_make_api_call",
      "raise error_class(parsed_response, operation_name)"
    ]
  ],
  "errorType": "ClientError",
  "errorMessage": "An error occurred (AllAccessDisabled) when calling the ListObjectsV2 operation: All access to this object has been disabled"
}

Request ID:
"afd650be-4841-43cf-9cf4-731390bea1ce"

Function Logs:
START RequestId: afd650be-4841-43cf-9cf4-731390bea1ce Version: $LATEST
An error occurred (AllAccessDisabled) when calling the ListObjectsV2 operation: All access to this object has been disabled: ClientError
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 11, in lambda_handler
    objs= client.list_objects_v2(Bucket='BUCKET')['Contents']
  File "/var/runtime/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/runtime/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
ClientError: An error occurred (AllAccessDisabled) when calling the ListObjectsV2 operation: All access to this object has been disabled

END RequestId: afd650be-4841-43cf-9cf4-731390bea1ce
REPORT RequestId: afd650be-4841-43cf-9cf4-731390bea1ce  Duration: 2125.21 ms    Billed Duration: 2200 ms    Memory Size: 128 MB Max Memory Used: 58 MB  
  • Can you clarify the problem you have with your code? Any error messages? Can you `print last_added`? – jogold Apr 19 '19 at 11:18
  • @jogold I have added the error output now. Sorry for the late reply! – Filip van der Pol Apr 23 '19 at 13:49
  • This is when you are calling with `Bucket='BUCKET'` right? What happens when calling with the right bucket name? – jogold Apr 23 '19 at 14:01
  • `Response: { "errorMessage": "2019-04-24T15:35:25.738Z 1eef59b9-a1c5-4057-8c1c-ded6e8e99109 Task timed out after 3.00 seconds" } Request ID: "1eef59b9-a1c5-4057-8c1c-ded6e8e99109" Function Logs: START RequestId: 1eef59b9-a1c5-4057-8c1c-ded6e8e99109 Version: $LATEST END RequestId: 1eef59b9-a1c5-4057-8c1c-ded6e8e99109 REPORT RequestId: 1eef59b9-a1c5-4057-8c1c-ded6e8e99109 Duration: 3002.80 ms Billed Duration: 3000 ms Memory Size: 128 MB Max Memory Used: 60 MB 2019-04-24T15:35:25.738Z 1eef59b9-a1c5-4057-8c1c-ded6e8e99109 Task timed out after 3.00 seconds` – Filip van der Pol Apr 24 '19 at 15:36
  • What is the length of `objs`? Can you log it in CloudWatch? – jogold Apr 24 '19 at 15:41
  • @jogold I've been following the answer on [link](https://stackoverflow.com/questions/45375999/how-to-download-the-latest-file-of-an-s3-bucket-using-boto3) to try to get the result I want. I don't have a lot of knowledge of AWS, as this is the first time I'm using it. How can I load the latest CSV file in the Output folder of my bucket into a data visualisation tool like PowerBI or Tableau without using Redshift? Maybe there is an easier solution than a Lambda function? – Filip van der Pol Apr 25 '19 at 08:43
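
For what it's worth, here is a minimal sketch of one way the "latest file in a folder" lookup could be done. It is not a confirmed fix for the errors above. Two things in the posted code stand out: sorting by `LastModified` in ascending order and taking `[0]` returns the oldest object rather than the newest, and `strftime('%s')` is platform-dependent, whereas the `datetime` values can be compared directly. The sketch assumes the folder corresponds to the key prefix `Output/` and that `BUCKET` is a placeholder to be replaced with the real bucket name. The helper names `pick_latest` and `list_objects` are made up for illustration.

```python
from datetime import datetime, timezone


def pick_latest(objects):
    """Return the key of the most recently modified object, or None."""
    # Skip "folder" placeholder keys, which end with a slash.
    files = [o for o in objects if not o["Key"].endswith("/")]
    if not files:
        return None
    # LastModified is a datetime, so it can be compared directly.
    return max(files, key=lambda o: o["LastModified"])["Key"]


def list_objects(client, bucket, prefix):
    """Yield every object under the prefix, following pagination."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page has no matching keys.
        yield from page.get("Contents", [])


def lambda_handler(event, context):
    import boto3  # imported here so the helpers above work without boto3

    bucket = "BUCKET"   # placeholder: replace with the real bucket name
    prefix = "Output/"  # assumption: the folder is the key prefix "Output/"
    client = boto3.client("s3")

    key = pick_latest(list_objects(client, bucket, prefix))
    if key is None:
        return {"statusCode": 404, "body": "No files found under " + prefix}

    link = client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=7200,
    )
    return {"statusCode": 303, "headers": {"Location": link}}


if __name__ == "__main__":
    # Offline check of the selection logic with made-up listing entries.
    sample = [
        {"Key": "Output/Output18042019.csv",
         "LastModified": datetime(2019, 4, 18, tzinfo=timezone.utc)},
        {"Key": "Output/Output19042019.csv",
         "LastModified": datetime(2019, 4, 19, tzinfo=timezone.utc)},
    ]
    print(pick_latest(sample))
```

The function's execution role would also need permission to list the bucket and read the objects. The "Task timed out after 3.00 seconds" message matches Lambda's default 3-second timeout, so raising the timeout in the function configuration may be worth trying as well.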

0 Answers