Suppose I want to download only 10 files from the bucket. How do I pass 10 as an argument?
- Why do you only want 10 files? Which 10 files would you want? Random ones? What is your use-case? Please provide more information and you'll have a better chance of obtaining a useful answer. (Hint: There is no such argument, but if we understand *why* you're asking for it, we can probably provide an alternative.) – John Rotenstein Mar 20 '18 at 04:24
- My use case: every 30 minutes I need to download 10 random files and import them into my system. Basically, I should be able to define how many files I want to download from S3 and import into my system. – purushotham nikhil Mar 20 '18 at 05:31
- For tips on asking a good question, please see: [How do I ask a good question?](http://stackoverflow.com/help/how-to-ask) – John Rotenstein Mar 21 '18 at 07:00
2 Answers
The easiest way is to write a Python script that you can run every 30 minutes. The following code does the job:
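import random

import boto3

BUCKET = 'bucket_name'
NUM_FILES = 10  # how many random files to download

s3 = boto3.client('s3')

# list_objects_v2 returns at most 1,000 keys per call, so page through the results.
keys = []
for page in s3.get_paginator('list_objects_v2').paginate(Bucket=BUCKET):
    for obj in page.get('Contents', []):
        keys.append(obj['Key'])

# random.sample picks distinct keys, so no file is downloaded twice in one run.
for key in random.sample(keys, min(NUM_FILES, len(keys))):
    # Saves under the key name; keys containing '/' need the local directory to exist.
    s3.download_file(BUCKET, key, key)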
The number 10 in the script is the count of random files to download; you can replace it with a value passed in as an argument. If you want the script to run automatically every 30 minutes, wrap the code above in a function and call it repeatedly using Python's "sched" module. Example code for that is in the question "What is the best way to repeatedly execute a function every x seconds in Python?"
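As a rough illustration of the scheduling idea, here is a minimal sketch using the standard-library `sched` module. The helper name `run_repeatedly` and its parameters are assumptions made for this example, not part of boto3 or of the answer's code, and `task` stands for whatever function wraps the download logic.

```python
import sched
import time


def run_repeatedly(task, interval_seconds, repetitions):
    """Call task() `repetitions` times, `interval_seconds` apart (hypothetical helper)."""
    scheduler = sched.scheduler(time.monotonic, time.sleep)
    start = time.monotonic()
    for i in range(repetitions):
        # Use absolute times so a slow run does not push later runs back.
        scheduler.enterabs(start + i * interval_seconds, 1, task)
    scheduler.run()  # blocks until every scheduled call has finished


# Example (assumes a function download_random_files exists):
# run_repeatedly(download_random_files, 30 * 60, repetitions=48)  # one day of runs
```

For a long-running job, an operating-system scheduler such as cron is a common alternative to keeping a Python process alive.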

Your use case appears to be:
- Every 30 minutes
- Download 10 random files from Amazon S3
Presumably, these 10 files should not be files previously downloaded.
There is no in-built S3 functionality to download a random selection of files. Instead, you will need to:
- Obtain a listing of files from your desired S3 bucket and optional path
- Randomly select which files you want to download
- Download the selected files
This is easily done in a programming language (e.g. Python): obtain a list of filenames, shuffle it, then loop through the first few entries and download each file.
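The selection step could be sketched as follows, under the presumption stated above that previously downloaded files should be skipped. The function `pick_new_keys` and the idea of keeping a set of already-downloaded keys are assumptions for illustration; how that set is persisted between runs (a local file, a database) is left open.

```python
import random


def pick_new_keys(all_keys, already_downloaded, count):
    """Return up to `count` distinct keys that are not in `already_downloaded`."""
    candidates = [key for key in all_keys if key not in already_downloaded]
    # Cap the sample size so a nearly exhausted bucket does not raise ValueError.
    return random.sample(candidates, min(count, len(candidates)))


# Example with made-up key names:
# chosen = pick_new_keys(["a.csv", "b.csv", "c.csv"], {"a.csv"}, 2)
# `chosen` then holds "b.csv" and "c.csv" in random order.
```

After each run, the chosen keys would be added to the already-downloaded set so the next run draws only from the remainder.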
You can also do it in a shell script by calling the AWS Command-Line Interface (CLI) to obtain the listing (`aws s3 ls`) and to copy the files (`aws s3 cp`).
Alternatively, you could choose to synchronize ALL the files to your local machine (`aws s3 sync`) and then select random local files to process.
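For the synchronize-then-select option, picking random local files might look like the sketch below. It assumes the bucket has already been synchronized into a local directory; the function name `pick_local_files` is hypothetical.

```python
import random
from pathlib import Path


def pick_local_files(directory, count):
    """Return up to `count` randomly chosen regular files under `directory`."""
    # rglob("*") walks subdirectories too, matching keys that contain '/'.
    files = [path for path in Path(directory).rglob("*") if path.is_file()]
    return random.sample(files, min(count, len(files)))
```

This trades bandwidth and disk space for simplicity, so it fits best when the bucket is small relative to local storage.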
Try the above steps. If you experience difficulties, post your code and the error/problem you are experiencing and we can assist.
