
I'm trying to get the most recently added file in a specific S3 folder. I followed this post (How to download the latest file of an S3 bucket using Boto3?) and tried the following:

@api_view(("GET",))
def get_protocol(request, pk):

    union = Union.objects.get(pk=pk)
    s3 = get_s3_client()
    filepath = "media/private/" + str(pk) + "/certificate/docx"

    get_last_modified = lambda obj: int(obj["LastModified"].strftime("%s"))
    objs = s3.list_objects_v2(
        Bucket="unifolio",
        Prefix=filepath + "/" + "Union" + str(pk) + "_" + "certificate" + "3",
    )
    last_added = [obj["Key"] for obj in sorted(objs, key=get_last_modified)][0]
    url = s3.generate_presigned_url(
        ClientMethod="get_object",
        Params={"Bucket": "unifolio", "Key": last_added},
        # The URL can no longer be accessed 60 seconds after it is generated
        ExpiresIn=60,
    )

    return Response()

But the following error occurs:

  File "/Users/kimdoehoon/project/unifolio/unions/api_views.py", line 199, in get_protocol
    objs_sorted = [obj["Key"] for obj in sorted(objs, key=get_last_modified)]
  File "/Users/kimdoehoon/project/unifolio/unions/api_views.py", line 194, in <lambda>
    get_last_modified = lambda obj: int(obj["LastModified"].strftime("%s"))
TypeError: string indices must be integers

I don't understand why I get the "string indices must be integers" error. Could anybody kindly help me?

John Rotenstein
dhooonk

1 Answer


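The sort fails before it ever compares dates. `list_objects_v2()` returns a dict, so `sorted(objs, ...)` iterates over its keys, which are strings, and `obj["LastModified"]` then tries to index a string. The object entries themselves are under `objs["Contents"]`.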
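
The traceback can be reproduced without touching AWS. Below is a minimal sketch, assuming a response shaped like the dict that `list_objects_v2()` returns; the keys and timestamps are made up for illustration. Iterating over a dict yields its string keys, and indexing one of those strings with `"LastModified"` raises the `TypeError` from the question.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for the dict returned by list_objects_v2();
# the real response carries the object entries under "Contents".
response = {
    "KeyCount": 2,
    "Contents": [
        {"Key": "docs/a.docx", "LastModified": datetime(2021, 1, 1, tzinfo=timezone.utc)},
        {"Key": "docs/b.docx", "LastModified": datetime(2021, 6, 1, tzinfo=timezone.utc)},
    ],
}

# Iterating over the dict itself yields its keys, which are strings.
iterated = list(response)

# Indexing one of those strings with a string key raises TypeError.
try:
    iterated[0]["LastModified"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Selecting from the entries under "Contents" works as intended.
newest = max(response["Contents"], key=lambda obj: obj["LastModified"])
newest_key = newest["Key"]
```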

Here's some code that uses an S3 Resource to retrieve the latest modified object:

import boto3

s3_resource = boto3.resource('s3')

objects = list(s3_resource.Bucket('my-bucket').objects.filter(Prefix='my-folder/'))
objects.sort(key=lambda o: o.last_modified)

print(objects[-1].key)
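
If you would rather keep the client API from the question, the same idea might look like the sketch below. `newest_key` is a hypothetical helper name, not part of boto3. It assumes a response dict shaped like the one `list_objects_v2()` returns and compares the `LastModified` datetimes directly, which also avoids the platform-dependent `strftime("%s")`.

```python
from datetime import datetime, timezone


def newest_key(response):
    """Return the key of the most recently modified object, or None.

    `response` is assumed to be shaped like the dict returned by the
    boto3 client's list_objects_v2(); "Contents" is absent when no
    object matches the prefix.
    """
    contents = response.get("Contents", [])
    if not contents:
        return None
    # LastModified values are datetimes, so they compare directly.
    return max(contents, key=lambda obj: obj["LastModified"])["Key"]


# Made-up sample data standing in for a real response.
sample = {
    "Contents": [
        {"Key": "folder/old.docx", "LastModified": datetime(2021, 3, 1, tzinfo=timezone.utc)},
        {"Key": "folder/new.docx", "LastModified": datetime(2021, 9, 1, tzinfo=timezone.utc)},
    ]
}
latest = newest_key(sample)
```

With a real client this would be called as `newest_key(s3.list_objects_v2(Bucket="unifolio", Prefix=filepath))`, and the result passed as `Key` to `generate_presigned_url`. Note that the question's `sorted(...)[0]` would pick the oldest object even once the iteration is fixed, because the sort is ascending.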
John Rotenstein
  • This doesn't work for me: there is no error, but the code never finishes running. – Bünyamin Şentürk Aug 25 '22 at 09:06
  • @BünyaminŞentürk If you have a very large number of objects in the bucket, it might take a long time to run. The S3 API returns at most 1,000 objects per request. If you have 100,000+ objects, it is best to avoid listing the contents too often. You can instead use [Amazon S3 Inventory](https://docs.aws.amazon.com/AmazonS3/latest/dev/storage-inventory.html), which can provide a daily or weekly CSV file listing all objects. – John Rotenstein Aug 25 '22 at 13:18
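
To make the 1,000-object limit concrete: with the client API, a single `list_objects_v2()` call only sees the first page, so a complete scan has to walk every page. The sketch below uses the client's `list_objects_v2` paginator; `newest_key_paginated` and its parameters are hypothetical names chosen for illustration. It keeps only the newest entry seen so far rather than building a full list, though it still issues one request per 1,000 objects, so it remains slow on very large prefixes.

```python
def newest_key_paginated(client, bucket, prefix):
    """Scan every page under `prefix` and return the newest object's key.

    `client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    Returns None when no object matches the prefix.
    """
    paginator = client.get_paginator("list_objects_v2")
    newest = None
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is missing from a page that holds no objects.
        for obj in page.get("Contents", []):
            if newest is None or obj["LastModified"] > newest["LastModified"]:
                newest = obj
    return None if newest is None else newest["Key"]
```

A call might look like `newest_key_paginated(boto3.client("s3"), "my-bucket", "my-folder/")`. The resource-based `objects.filter(...)` in the answer pages through results automatically, which is why it can run for a long time on a big bucket.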