
I would like to check whether a file exists in a separate directory of the bucket, given that another file exists. I currently do it like this:

import boto3
import botocore

s3 = boto3.resource('s3')

def file_exists(fileN):
    try:
        # first check the known path
        s3.Object('my-bucket', 'folder1/folder2/' + fileN).load()
    except botocore.exceptions.ClientError:
        return False
    else:
        fileN = fileN.split(".")[0]
        try:
            # then check the condition file under the random-ID folder
            s3.Object('my-bucket', 'folder1/<randomid folderxxxx>/' + fileN + '_condition.jpg').load()
        except botocore.exceptions.ClientError:
            return False
        else:
            return True

file_exists("test.jpg")

This works, but only as long as I can send the random folder ID as an argument. Is there a better, more elegant way to do it?

Basically I have to check: if my-bucket/folder1/folder2/test.jpg exists, then check my-bucket/folder1/<randomID>/test_condition.jpg; if that also exists, return True.

Pavan K

2 Answers


I ended up using this, which gave slightly cleaner code:

import boto3
s3client = boto3.client('s3')

def all_file_exist(bucket, prefix, fileN):
    fileFound = False
    fileConditionFound = False
    theObjs = s3client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # 'Contents' is absent from the response when no keys match the prefix
    for obj in theObjs.get('Contents', []):
        if obj['Key'].endswith(fileN + '_condition.jpg'):
            fileConditionFound = True
        if obj['Key'].endswith(fileN + '.jpg'):
            fileFound = True
    return fileFound and fileConditionFound

all_file_exist("bucket","folder1", "test")
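
One caveat: list_objects_v2 returns at most 1,000 keys per call, so the loop above can miss files under a large prefix. Here is a sketch of the same check using a paginator, which issues the follow-up requests automatically (same hypothetical bucket and prefix names):

import boto3

s3client = boto3.client('s3')

def all_file_exist_paginated(bucket, prefix, fileN):
    fileFound = False
    fileConditionFound = False
    # The paginator transparently fetches further pages when the
    # listing spans more than 1,000 keys.
    paginator = s3client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get('Contents', []):
            if obj['Key'].endswith(fileN + '_condition.jpg'):
                fileConditionFound = True
            if obj['Key'].endswith(fileN + '.jpg'):
                fileFound = True
    return fileFound and fileConditionFound

all_file_exist_paginated("bucket", "folder1", "test")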

Pavan K

It is not possible to specify an object key via a wildcard.

Instead, you would need to do a bucket listing (which can be against the whole bucket, or within a path) and then perform your own logic for identifying the file of interest.
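
For example, here is a sketch of that listing-plus-filtering approach, using Python's fnmatch for the wildcard logic (the bucket name and pattern are illustrative):

import fnmatch
import boto3

s3client = boto3.client('s3')

def find_matching_keys(bucket, prefix, pattern):
    # S3 has no server-side wildcard lookup, so list under the prefix
    # and filter the keys client-side. Note fnmatch's '*' also crosses '/'.
    paginator = s3client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get('Contents', []):
            if fnmatch.fnmatch(obj['Key'], pattern):
                yield obj['Key']

# e.g. find test_condition.jpg under any random-ID folder
matches = list(find_matching_keys('my-bucket', 'folder1/', 'folder1/*/test_condition.jpg'))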

If the number of objects is small (e.g. a few thousand), the list can easily be retrieved and kept in memory in a Python list for fast comparison.
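
A minimal sketch of that in-memory approach, reusing the bucket and prefix names from the question; a set is used rather than a list so that each existence check is O(1):

import boto3

s3client = boto3.client('s3')

# Build the in-memory key collection once, then test membership cheaply.
paginator = s3client.get_paginator('list_objects_v2')
all_keys = set()
for page in paginator.paginate(Bucket='my-bucket', Prefix='folder1/'):
    for obj in page.get('Contents', []):
        all_keys.add(obj['Key'])

print('folder1/folder2/test.jpg' in all_keys)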

If there are millions of objects, you might consider using Amazon S3 Inventory, which can provide a daily CSV file that lists all objects in the bucket. Using such a file would be faster than scanning the bucket itself.
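
Here is a sketch of checking against such an inventory file, assuming a CSV-format inventory (gzipped, with the bucket name and object key as the first two columns) that has already been downloaded locally; the filename is hypothetical:

import csv
import gzip

def key_in_inventory(inventory_path, wanted_key):
    # Each inventory row describes one object; column 1 is the key
    # (column 0 is the bucket name), assuming the CSV output format.
    with gzip.open(inventory_path, 'rt', newline='') as f:
        for row in csv.reader(f):
            if row[1] == wanted_key:
                return True
    return False

key_in_inventory('inventory-2020-01-01.csv.gz', 'folder1/folder2/test.jpg')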

John Rotenstein