10

I have boto code that lists the S3 sub-folders under the levelOne folder:

import boto

s3 = boto.connect_s3()
bucket = s3.get_bucket("MyBucket")

for level2 in bucket.list(prefix="levelOne/", delimiter="/"):
    print(level2.name)

How can I do the same thing in boto3? The code must not iterate over every S3 object, because the bucket contains a very large number of objects.

John Rotenstein
Mikhail Orechkine

2 Answers

15

If you are simply seeking a list of folders, then use CommonPrefixes returned when listing objects. Note that a Delimiter must be specified to obtain the CommonPrefixes:

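import boto3

s3_client = boto3.client('s3')

# Delimiter='/' makes S3 group keys into CommonPrefixes ("folders").
# Add Prefix='levelOne/' to list the sub-folders of that folder instead of the bucket root.
response = s3_client.list_objects_v2(Bucket='BUCKET-NAME', Delimiter='/')

# CommonPrefixes is missing from the response when there are no folders.
for prefix in response.get('CommonPrefixes', []):
    print(prefix['Prefix'][:-1])  # strip the trailing '/'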

If your bucket has a HUGE number of folders and objects, you might consider using Amazon S3 Inventory, which can provide a daily or weekly CSV file listing all objects.
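
A single list_objects_v2 call returns at most 1,000 entries, so a bucket with many sub-folders needs pagination. The following is a minimal sketch rather than part of the original answer: it assumes the sub-folders sit under a prefix such as levelOne/ (taken from the question), and list_subfolders is a hypothetical helper name.

```python
def list_subfolders(s3_client, bucket, prefix=""):
    """Return the immediate sub-folder prefixes under `prefix`.

    `s3_client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/")

    folders = []
    for page in pages:
        # CommonPrefixes is absent from a page that contains no sub-folders.
        for common_prefix in page.get("CommonPrefixes", []):
            folders.append(common_prefix["Prefix"])
    return folders
```

For the bucket in the question, this would be called as list_subfolders(boto3.client('s3'), 'MyBucket', 'levelOne/'). Because the delimiter makes S3 roll up everything below each sub-folder into one CommonPrefixes entry, the objects inside those folders are never returned individually.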

John Rotenstein
3

I think the following should be equivalent:

import boto3

s3 = boto3.resource('s3')

bucket = s3.Bucket('MyBucket')

# Named obj rather than object to avoid shadowing the Python built-in.
for obj in bucket.objects.filter(Prefix="levelOne/", Delimiter="/"):
    print(obj.key)
Marcin
  • 2
    In my experience, it is not. `Delimiter` and `boto3.resource` don't play well together. The issue is discussed [here](https://stackoverflow.com/a/51372405/8929855) and [here](https://stackoverflow.com/a/51372500/8929855). If starting from a `boto3.resource` is a hard requirement, getting the corresponding `boto3.client` through `{resource}.meta.client` seems to be the best solution – swimmer Sep 21 '22 at 14:48
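
The `{resource}.meta.client` route mentioned in the comment above could look roughly like the sketch below. It is an illustration, not code from the thread: subfolders_via_resource is a hypothetical helper name, and it reads only the first response page (up to 1,000 entries).

```python
def subfolders_via_resource(s3_resource, bucket, prefix=""):
    """List sub-folder prefixes when the caller only holds a boto3 resource.

    `s3_resource` is assumed to be the object returned by boto3.resource("s3").
    """
    # Every boto3 resource exposes its underlying low-level client here.
    client = s3_resource.meta.client
    response = client.list_objects_v2(Bucket=bucket, Prefix=prefix, Delimiter="/")
    # CommonPrefixes is absent when there are no sub-folders under the prefix.
    return [cp["Prefix"] for cp in response.get("CommonPrefixes", [])]
```

With the names from the question, the call would be subfolders_via_resource(boto3.resource('s3'), 'MyBucket', 'levelOne/').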