
I want to change this line

   files = os.listdir('/Users/milenko/mario/Json_gzips')

in my code so that it reads the .gz files from my S3 bucket straight into a list. I tried:

>>> import boto3
>>> s3 = boto3.resource('s3')
>>> s3
s3.ServiceResource()
>>> my_bucket = s3.Bucket('cw-dushpica-tests')

>>> for object_summary in my_bucket.objects.filter(Prefix='*.gz'):
...     print(object_summary)

There is no output; nothing is printed.

>>> for object_summary in my_bucket.objects.filter(Prefix='/'):
...     print(object_summary)

Same result: nothing is printed.

What should my prefix look like?

Djikii
  • I think `Prefix` works like a directory. You can get all the files and then filter them. Please check this: https://stackoverflow.com/questions/3964681/find-all-files-in-a-directory-with-extension-txt-in-python/14001395 – Dmitry Leiko Jun 23 '20 at 12:20
  • @Dmitry What should I use for the current dir? – Djikii Jun 23 '20 at 12:22
  • Please also check this: https://devqa.io/download-s3-objects-python-boto3/ – Dmitry Leiko Jun 23 '20 at 12:24

1 Answer


The documentation describes the `Prefix` parameter of the `filter` method as follows:

Prefix (string) -- Limits the response to keys that begin with the specified prefix.

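So you can use `Prefix` to restrict the listing to a specific folder and then filter by file extension yourself. `Prefix` is matched literally against the start of each key; it is not a glob pattern, so `'*.gz'` matches nothing. Keys normally do not start with a slash, so `'/'` matches nothing either.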
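To make the matching rule concrete, here is a small local sketch that mimics it with plain strings, with no S3 call involved. The key names and the `Json_gzips/` folder are made up for illustration, and `match_prefix` is a hypothetical helper, not part of boto3.

```python
# Hypothetical keys, invented for this example; real keys come from your bucket.
keys = [
    'Json_gzips/a.json.gz',
    'Json_gzips/b.json.gz',
    'Json_gzips/readme.txt',
    'other/c.json.gz',
]


def match_prefix(keys, prefix):
    """Mimic the Prefix filter locally: a literal starts-with test, no wildcards."""
    return [k for k in keys if k.startswith(prefix)]


star = match_prefix(keys, '*.gz')           # no key literally begins with "*.gz"
slash = match_prefix(keys, '/')             # no key begins with "/"
folder = match_prefix(keys, 'Json_gzips/')  # everything under the "folder"

# The extension check has to happen on the client side.
gz_in_folder = [k for k in folder if k.endswith('.gz')]

print(star, slash, gz_in_folder)
```

Both attempts from the question come back empty here for the same reason they did against the bucket, while the folder prefix plus an `endswith` check gives the two .gz keys.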

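import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('your_bucket')

# Prefix narrows the listing to one "folder"; the extension is checked client-side.
keys = []
for obj in bucket.objects.filter(Prefix='path/to/files/'):
    # Match '.gz' with the dot so keys that merely end in the letters "gz" are skipped.
    if obj.key.endswith('.gz'):
        keys.append(obj.key)

print(keys)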
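One difference from `os.listdir` is worth noting: it returns bare file names, whereas object keys include the whole prefix. If the rest of your code expects bare names, something like the sketch below might help. `list_gz_names` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for what `bucket.objects.filter(...)` yields, so the example runs without touching S3.

```python
import posixpath
from types import SimpleNamespace


def list_gz_names(objects, suffix='.gz'):
    """Return the base names of keys ending in `suffix`, similar to os.listdir output.

    `objects` is any iterable whose items have a `.key` attribute.
    """
    return [posixpath.basename(o.key) for o in objects if o.key.endswith(suffix)]


# Stand-in objects with made-up keys, for illustration only.
fake_objects = [
    SimpleNamespace(key='path/to/files/a.json.gz'),
    SimpleNamespace(key='path/to/files/notes.txt'),
    SimpleNamespace(key='path/to/files/b.json.gz'),
]

names = list_gz_names(fake_objects)
print(names)
```

With a real bucket you would pass `bucket.objects.filter(Prefix='path/to/files/')` instead of `fake_objects`. Keep the full key around if you need to download the object later, since the base name alone does not identify it.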
Lamanus