1042

I noticed that there does not seem to be an option to download an entire s3 bucket from the AWS Management Console.

Is there an easy way to grab everything in one of my buckets? I was thinking about making the root folder public, using wget to grab it all, and then making it private again but I don't know if there's an easier way.

veben
  • 19,637
  • 14
  • 60
  • 80
rugbert
  • 11,603
  • 9
  • 39
  • 68
  • 17
    As many people here said, `aws s3 sync` is the best. But nobody pointed out a powerful option: `--dryrun`. This option allows you to see what would be downloaded/uploaded from/to S3 when you are using `sync`. This is really helpful when you don't want to overwrite content either locally or in an S3 bucket. This is how it is used: `aws s3 sync --dryrun`. I used it all the time before pushing new content to a bucket in order to not upload undesired changes. – Perimosh Oct 18 '18 at 16:21
  • Here's a quick video showing `aws s3 sync` in practice: https://www.youtube.com/watch?v=J2aZodwPeQk – Dennis Traub Apr 01 '21 at 20:43
  • See **2021/09** complete answer: https://stackoverflow.com/a/68981037/8718377 – veben Aug 30 '21 at 08:44
  • For a literal download only... `aws s3 cp s3://Bucket/Folder LocalFolder --recursive` – DanielBell99 Oct 04 '22 at 16:09

38 Answers

1884

AWS CLI

See the "AWS CLI Command Reference" for more information.

AWS recently released their Command Line Tools, which work much like boto and can be installed using

sudo easy_install awscli

or

sudo pip install awscli

Once installed, you can then simply run:

aws s3 sync s3://<source_bucket> <local_destination>

For example:

aws s3 sync s3://mybucket .

will download all the objects in mybucket to the current directory.

And will output:

download: s3://mybucket/test.txt to test.txt
download: s3://mybucket/test2.txt to test2.txt

This will download all of your files using a one-way sync. It will not delete any existing files in your current directory unless you specify --delete, and it won't change or delete any files on S3.

You can also do S3 bucket to S3 bucket, or local to S3 bucket sync.
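
For example (a rough sketch; the bucket names and paths here are placeholders):

aws s3 sync s3://mybucket s3://mybucket-backup       # bucket to bucket
aws s3 sync . s3://mybucket                          # local directory to bucket
aws s3 sync s3://mybucket . --delete                 # download, deleting local files not in S3

The --delete flag is the one to double-check before running, since it removes files from the destination that no longer exist in the source.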

Check out the documentation and other examples.

While the above example shows how to download a full bucket, you can also download a folder recursively by running

aws s3 cp s3://BUCKETNAME/PATH/TO/FOLDER LocalFolderName --recursive

This will instruct the CLI to download all files and folder keys recursively within the PATH/TO/FOLDER directory within the BUCKETNAME bucket.

the Tin Man
  • 158,662
  • 42
  • 215
  • 303
Layke
  • 51,422
  • 11
  • 85
  • 111
  • 305
    First run `aws configure` and add your `access key` and `secret access key` which can be found [here](https://console.aws.amazon.com/iam/home?#security_credential). – user2609980 May 17 '14 at 08:47
  • 13
    Go here for the windows installer http://aws.amazon.com/cli/. It picks up access key id from environment variable "AWS_ACCESS_KEY_ID" and your secret key from "AWS_SECRET_ACCESS_KEY". – Matt Bond Jul 18 '14 at 19:03
  • how can I use your solution if I have to perform pattern matching for downloading? My question: http://stackoverflow.com/questions/25086722/downloading-pattern-matched-entries-from-s3-bucket/25087286#25087286 – Shrikant Kakani Aug 01 '14 at 20:51
  • 10
    I've tried `s3cmd` and `Cyberduck`, but for me `awscli` was by far the fastest way to download ~70.000 files from my bucket. – Arjen Aug 22 '14 at 07:46
  • 1
    pip command is recommended over easy_install since it allows to UNINSTALL CLI if necessary. – Ivan Nikitin Aug 25 '14 at 18:19
  • 14
    Please note that while the question asked about download only, I believe this command will do a 2-way sync between your directory and S3. If you're not trying to upload anything, make sure the current directory is empty. – Jesse Crossen Nov 26 '14 at 19:40
  • 1
    Having used the CLI and s3cmd, the CLI is definitely the way to go. Working with a bucket that has ~1M files s3cmd would take forever just to get the list of files, where CLI instantly started moving. – jaredstenquist Dec 08 '14 at 15:32
  • 3
    @JesseCrossen from my experience, it's not a 2-way sync. It works with the concepts of source and destination - first argument being the source and the second one, the destination. – rapcal Nov 27 '15 at 16:58
  • 19
    @JesseCrossen That `aws s3 sync` command will not upload anything, but it will delete files locally if they don't exist on S3. See [the documentation](http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html). – Flimm Jul 08 '16 at 12:15
  • 3
    in the documentation: http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html --delete (boolean) Files that exist in the destination but not in the source are deleted during sync. so unless you actually enter --delete it will NOT delete files if I am not mistaken – Ramon Fincken Nov 11 '16 at 15:55
  • sync does not do a two way.. it is a one way from source to destination. Also, if you have lots of items in bucket it will be a good idea to create s3 endpoint so that download happens faster and no charges – Deepak Singhal Jan 26 '17 at 10:54
  • @Richvel - How does sync work when run multiple times (which of course must be done to get continuous sync) - does it compare file names or file hashes or something to compare what's already been copied? – Howiecamp Mar 12 '17 at 23:20
  • @Howiecamp - that's covered in [this doc section](http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html#examples) – RichVel Mar 13 '17 at 09:04
  • Remember if your bucket happens to be located in the new Frankfurt region to explicitly specify the region for the aws commmand: aws s3 sync s3://mybucket . --region eu-central-1 – nover Apr 06 '17 at 16:11
  • 1) When I try to download the files using above command (aws s3 cp s3://bucketname/object.csv .), I get the following error: fatal error: An error occurred (400) when calling the HeadObject operation: Bad Request 2) I made the file public and then tried: aws s3 cp s3://bucketname/object.csv . fatal error: An error occurred (InvalidRequest) when calling the ListObjects operation: Missing required header for this request: x-amz-content-sha256 Any thoughts? – New Coder Jun 23 '17 at 19:09
  • Hi, what if my download fails midway while downloading from bucket to local machine. Is there a way to resume the download from where it failed using aws cli tools or will it download and override the files in local? – Jainam Jhaveri Aug 30 '17 at 10:51
  • 6
    "brew install awscli" for macOS users. – Conor Sep 26 '17 at 15:08
  • 1
    For anyone having an issue installing this via pip on macOS (anything higher than El Capitan), you need to run `sudo pip install awscli --upgrade --ignore-installed six`, due to SIP. See the [GitHub issues](https://github.com/pypa/pip/issues/3165#issuecomment-145856429) . – Othyn Nov 15 '17 at 15:03
  • You can also sync only certain directories in a bucket as in `aws s3 sync s3://my-bucket/path/to/directory .` – iGEL Feb 15 '18 at 07:48
  • I don't think this should be the accepted answer, it only works if you have less than 1000 objects in your bucket, otherwise you'll need a recursive function. the --recursive flag still only downloads 1000 files vs 1 file. with this behaviour it would be better to use sync – WindDude Aug 21 '19 at 17:45
  • You can download the bucket using `$ aws s3 sync s3:///` – Tapan Banker Nov 03 '19 at 21:34
  • If you have any issues with aws not accessing the bucket, set the region with --region us-east-1 for example – Toby Okeke Apr 23 '21 at 07:22
195

You can use s3cmd to download your bucket:

s3cmd --configure
s3cmd sync s3://bucketnamehere/folder /destination/folder

There is another tool you can use called rclone. This is a code sample in the Rclone documentation:

rclone sync /home/local/directory remote:bucket
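
The example above goes from local to the bucket. To download a bucket (the direction asked about here), reverse the arguments. A minimal sketch, assuming a remote named remote has already been configured with rclone config:

rclone sync remote:bucket /home/local/directory     # mirror the bucket locally (deletes local extras)
rclone copy remote:bucket /home/local/directory     # copy only, never deletes local files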
the Tin Man
  • 158,662
  • 42
  • 215
  • 303
Phil M.
  • 2,332
  • 2
  • 15
  • 14
  • 6
    This is quite slow. Especially if you attempt to use it incrementally. Is there a solution that is multi-threaded so it can saturate the bandwidth? – Peter Lada Oct 08 '13 at 03:34
  • the solutions below this are better, more standard and open to more platforms – abc123 Dec 11 '13 at 19:58
  • This does not work for requester pays buckets (see http://arxiv.org/help/bulk_data_s3) :-( – Martin Thoma Jun 23 '14 at 16:08
120

I've used a few different methods to copy Amazon S3 data to a local machine, including s3cmd, and by far the easiest is Cyberduck.

All you need to do is enter your Amazon credentials and use the simple interface to download, upload, sync any of your buckets, folders or files.

Screenshot

the Tin Man
  • 158,662
  • 42
  • 215
  • 303
wedocando
  • 1,201
  • 1
  • 8
  • 3
92

You have many options to do that, but the best one is using the AWS CLI.

Here's a walk-through:

  1. Download and install the AWS CLI on your machine:

  2. Configure AWS CLI:


    Make sure you input valid access and secret keys, which you received when you created the account.

  3. Sync the S3 bucket using:

    aws s3 sync s3://yourbucket /local/path
    

    In the above command, replace the following fields:

    • yourbucket >> your S3 bucket that you want to download.
    • /local/path >> path in your local system where you want to download all the files.
the Tin Man
  • 158,662
  • 42
  • 215
  • 303
Darshan Lila
  • 5,772
  • 2
  • 24
  • 34
  • 1
    I used this instead of cyberduck, because cyberduck needs to "prepare" files before it starts downloading. For large amounts of files that seemed to take ages and I couldn't find information on what "preparing" actually does. CLI started downloading instantly – Tashows Apr 30 '19 at 14:26
  • 1
    make sure you have that `s3://` prefix in bucket name!!! With `aws s3 ls` you don't need that `s3://` prefix but you need for `cp` command. – cjmling Apr 15 '20 at 12:21
76

To download using AWS S3 CLI:

aws s3 cp s3://WholeBucket LocalFolder --recursive
aws s3 cp s3://Bucket/Folder LocalFolder --recursive

To download using code, use the AWS SDK.

To download using GUI, use Cyberduck.

the Tin Man
  • 158,662
  • 42
  • 215
  • 303
Sarat Chandra
  • 5,636
  • 34
  • 30
  • 1
    How to ignore some files or folder? – Nabin Jan 18 '18 at 03:31
  • 5
    @Nabin you can use --include & --exclude with wildcard to exclude some file or folder, like this: `aws s3 cp s3://my-bucket-name ./local-folder --recursive --include "*" --exclude "excludeFolder/*" --exclude "includeFolder/excludeFile.txt"` – DarkCenobyte Aug 12 '18 at 21:24
54

The answer by @Layke is good, but if you have a ton of data and don't want to wait forever, you should read "AWS CLI S3 Configuration".

The following commands will tell the AWS CLI to use 1,000 threads to execute jobs (each a small file or one part of a multipart copy) and look ahead 100,000 jobs:

aws configure set default.s3.max_concurrent_requests 1000
aws configure set default.s3.max_queue_size 100000

After running these, you can use the simple sync command:

aws s3 sync s3://source-bucket/source-path s3://destination-bucket/destination-path

or

aws s3 sync s3://source-bucket/source-path c:\my\local\data\path

On a system with a 4-core CPU and 16 GB of RAM, for cases like mine (3-50 GB files), the sync/copy speed went from about 9.5 MiB/s to 700+ MiB/s, a speed increase of 70x over the default configuration.
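
If you later want to check or roll back these settings, something like the following should work; the values 10 and 1,000 are the documented defaults, but treat them as an assumption to verify against the current "AWS CLI S3 Configuration" page:

aws configure get default.s3.max_concurrent_requests    # show the current value
aws configure get default.s3.max_queue_size
aws configure set default.s3.max_concurrent_requests 10
aws configure set default.s3.max_queue_size 1000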

the Tin Man
  • 158,662
  • 42
  • 215
  • 303
James
  • 3,551
  • 1
  • 28
  • 38
  • 6
    this is the real answer. just tested it, from ec2 it transferred about 2.3GB/min. without the concurrent options about 1GB/min. lifesaver. – Karsten Mar 01 '19 at 07:14
  • 1
    This is great! Another tip: for configuring these values for a non-default profile, don't simply replace `default` with `profile-name`. Instead use this: `aws configure set s3.max_concurrent_requests 1000 --profile profile-name`. – Pravin Singh Mar 22 '22 at 03:51
  • These settings crashed my browser and stopped the download on my macbook air m1 16gb memory. Had to turn them down a bit. – Christopher Reid Apr 17 '23 at 17:54
  • @ChristopherReid, it's not surprising that a large download crashed your browser, these settings should have no effect on a browser anyway, You need to use the CLI or a purpose-built program to download a bucket of any significant size. – James Apr 18 '23 at 22:29
  • @James I was using the cli sync commands to download. I was downloading about 90GB of data. While the sync was happening with these settings my browser tabs (firefox) kept crashing. I took a zero/order of magnitude off of each of your recommended settings and it ran fine... but obviously a bit slower. – Christopher Reid Apr 19 '23 at 18:36
45

This works 100% for me; I have downloaded all the files from my AWS S3 bucket.

  1. Install the AWS CLI. Select your operating system and follow the steps here: Installing or updating the latest version of the AWS CLI

  2. Check the AWS CLI version: aws --version

  3. Run the config command: aws configure

  4. Download the bucket: aws s3 cp s3://yourbucketname your\local\path --recursive

     E.g. (Windows OS): aws s3 cp s3://yourbucketname C:\aws-s3-backup\project-name --recursive

Check out this link: How to download an entire bucket from S3 to local folder

Abdullah Khawer
  • 4,461
  • 4
  • 29
  • 66
Najathi
  • 2,529
  • 24
  • 23
30

If you use Visual Studio, download "AWS Toolkit for Visual Studio".

After installing it, go to Visual Studio - AWS Explorer - S3 - your bucket - double-click.

In the window you will be able to select all the files. Right-click and download the files.

the Tin Man
  • 158,662
  • 42
  • 215
  • 303
Ives.me
  • 2,359
  • 18
  • 24
26

For Windows, S3 Browser is the easiest way I have found. It is excellent software, and it is free for non-commercial use.

the Tin Man
  • 158,662
  • 42
  • 215
  • 303
dworrad
  • 709
  • 8
  • 14
  • 4
    I just tried the "Download All Files to..." option (which I presume is equivalent to "download entire s3 bucket") and it said I need the Pro version. – Jack Ukleja Aug 11 '13 at 17:57
  • 3
    Update: But I was able to download an entire folder within the bucket which was sufficient for my needs... – Jack Ukleja Aug 11 '13 at 18:02
  • yeah the free version is pretty limited, you can select all, and download, but limited to only 2 simultaneous transfers – Hayden Thring Dec 19 '15 at 23:24
  • Was looking for a windows simple version after getting some python3 support error on Ubuntu 17.1 and s3cmd, this worked well. – edencorbin Oct 25 '17 at 11:06
21

Use this command with the AWS CLI:

aws s3 cp s3://bucketname . --recursive
jedierikb
  • 12,752
  • 22
  • 95
  • 166
ashack
  • 1,192
  • 11
  • 17
17

Another option that could help some OS X users is Transmit.

It's an FTP program that also lets you connect to your S3 files. It also has an option to mount any FTP or S3 storage as a folder in the Finder, but that's only available for a limited time.

Abdullah Khawer
  • 4,461
  • 4
  • 29
  • 66
Diederik
  • 377
  • 3
  • 13
15

The AWS CLI is the best option for uploading an entire folder or repository to AWS S3, and for downloading an entire AWS S3 bucket locally.

To upload whole folder to AWS S3: aws s3 sync . s3://BucketName

To download whole AWS S3 bucket locally: aws s3 sync s3://BucketName .

You can also specify a path like BucketName/Path to download a particular folder from the AWS S3 bucket, as in the sketch below.
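
A minimal sketch of a path-scoped download (BucketName and Path are placeholders):

aws s3 sync s3://BucketName/Path ./LocalFolder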

Harsh Manvar
  • 27,020
  • 6
  • 48
  • 102
13

I've done a bit of development for S3 and I have not found a simple way to download a whole bucket.

If you want to code in Java the jets3t lib is easy to use to create a list of buckets and iterate over that list to download them.

First, get an access key and secret key pair from the AWS Management Console so you can create an S3Service object:

AWSCredentials awsCredentials = new AWSCredentials(YourAccessKey, YourAwsSecretKey);
s3Service = new RestS3Service(awsCredentials);

Then, get an array of your buckets objects:

S3Object[] objects = s3Service.listObjects(YourBucketNameString);

Finally, iterate over that array to download the objects one at a time with:

S3Object obj = s3Service.getObject(bucket, fileName);
file = obj.getDataInputStream();

I put the connection code in a threadsafe singleton. The necessary try/catch syntax has been omitted for obvious reasons.

If you'd rather code in Python you could use Boto instead.

You might also look at BucketExplorer; its "Downloading the whole bucket" feature may do what you want.

the Tin Man
  • 158,662
  • 42
  • 215
  • 303
jeremyjjbrown
  • 7,772
  • 5
  • 43
  • 55
9

AWS CLI is the best option to download an entire S3 bucket locally.

  1. Install AWS CLI.

  2. Configure AWS CLI for using default security credentials and default AWS Region.

  3. To download the entire S3 bucket use command

    aws s3 sync s3://yourbucketname localpath

Reference to AWS CLI for different AWS services: AWS Command Line Interface

Abdullah Khawer
  • 4,461
  • 4
  • 29
  • 66
singh30
  • 1,335
  • 17
  • 22
7

You can do this with MinIO Client as follows: mc cp -r https://s3-us-west-2.amazonaws.com/bucketName/ localdir

MinIO also supports sessions, resumable downloads, resumable uploads and more. MinIO supports Linux, OS X and Windows operating systems. It is written in Golang and released under the Apache License 2.0.

Abdullah Khawer
  • 4,461
  • 4
  • 29
  • 66
Krishna Srinivas
  • 1,620
  • 1
  • 13
  • 19
7

If you only want to download the bucket from AWS, first install the AWS CLI on your machine. In a terminal, change to the directory into which you want to download the files, and run this command:

aws s3 sync s3://bucket-name .

If you also want to sync the local folder back to S3 (in case you added some files to the local folder), run this command:

aws s3 sync . s3://bucket-name
Muzammil
  • 1,529
  • 1
  • 15
  • 24
7

To add another GUI option, we use WinSCP's S3 functionality. It's very easy to connect, only requiring your access key and secret key in the UI. You can then browse and download whatever files you require from any accessible buckets, including recursive downloads of nested folders.

Since it can be a challenge to clear new software through security and WinSCP is fairly prevalent, it can be really beneficial to just use it rather than try to install a more specialized utility.

bsplosion
  • 2,641
  • 27
  • 38
6

If you use Firefox with S3Fox, it DOES let you select all files (Shift-select the first and last), then right-click and download them all.

I've done it with 500+ files without any problem.

Abdullah Khawer
  • 4,461
  • 4
  • 29
  • 66
jpw
  • 18,697
  • 25
  • 111
  • 187
  • This does not work for subfolders within a bucket, even if the "pseudo folders" were created in the AWS console. (As of the writing of this comment) – Wesley Feb 21 '13 at 05:25
  • Confirmed not working, I have about 12k top-level keys = subfolders), S3Fox does not even start up. Also insist on the permission to list all buckets! – Peter Lada Oct 08 '13 at 03:35
6

You can use sync to download a whole S3 bucket. For example, to download the whole bucket named bucket1 into the current directory:

aws s3 sync s3://bucket1 .
user1445267
  • 97
  • 3
  • 5
5

On Windows, my preferred GUI tool for this is CloudBerry Explorer Freeware for Amazon S3. It has a fairly polished file explorer and FTP-like interface.

Abdullah Khawer
  • 4,461
  • 4
  • 29
  • 66
fundead
  • 1,321
  • 1
  • 14
  • 15
5
aws s3 sync s3://<source_bucket> <local_destination>

is a great answer, but it won't work if the objects are in the Glacier Flexible Retrieval storage class, even if the files have been restored. In that case you need to add the --force-glacier-transfer flag, as in the sketch below.
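
A minimal sketch with the flag added (the bucket name and destination are placeholders):

aws s3 sync s3://<source_bucket> <local_destination> --force-glacier-transfer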

Tom Hubbard
  • 121
  • 2
  • 6
4

If you have only files there (no subdirectories), a quick solution is to select all the files (click on the first, Shift+click on the last) and hit Enter, or right-click and select Open. For most data files this will download them straight to your computer.

Lukasz Czerwinski
  • 13,499
  • 10
  • 55
  • 65
4

Try this command:

aws s3 sync s3://yourBucketnameDirectory yourLocalDirectory

For example, if your bucket name is myBucket and local directory is c:\local, then:

aws s3 sync s3://myBucket c:\local

For more information about the AWS CLI, check this AWS CLI installation guide.

pheeleeppoo
  • 1,491
  • 6
  • 25
  • 29
Primit
  • 825
  • 7
  • 13
4

It's always better to use the AWS CLI for downloading/uploading files to S3. Sync will help you resume without any hassle.

aws s3 sync s3://bucketname/ .
Jobin Joseph
  • 177
  • 2
  • 3
4

In addition to the suggestions for aws s3 sync, I would also recommend looking at s5cmd.

In my experience I found this to be substantially faster than the AWS CLI for multiple downloads or large downloads.

s5cmd supports wildcards so something like this would work:

s5cmd cp s3://bucket-name/* ./folder

Abdullah Khawer
  • 4,461
  • 4
  • 29
  • 66
wrschneider
  • 17,913
  • 16
  • 96
  • 176
  • 1
    s5cmd uses golang to be 10x faster than awscli https://joshua-robinson.medium.com/s5cmd-for-high-performance-object-storage-7071352cc09d – vdm Aug 29 '21 at 16:15
  • yes, I don't fully understand the limiting factors in AWS CLI, or why Golang is so much faster than Python (Python GIL limits multi-threading, maybe?) – wrschneider Aug 29 '21 at 17:04
4

Here is a summary of what you have to do to copy an entire bucket:

1. Create a user that can operate with AWS s3 bucket

Follow this official article: Configuration basics

Don't forget to:

  • tick "programmatic access" in order to be able to interact with AWS via the CLI.
  • add the right IAM policy to your user to allow it to interact with the S3 bucket

2. Download, install and configure AWS CLI

See this link for how to configure it: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html

You can use the following command in order to add the keys you got when you created your user:

$ aws configure
AWS Access Key ID [None]: <your_access_key>
AWS Secret Access Key [None]: <your_secret_key>
Default region name [None]: us-west-2
Default output format [None]: json

3. Use the following command to download content

You can use a recursive cp command, but the aws s3 sync command is generally preferable:

aws s3 sync s3://your_bucket /local/path

For example, the command below will list all the .png files present in the bucket. Rerun the command without --dryrun to actually download the files.

aws s3 sync s3://your_bucket /local/path --exclude "*" --include "*.png" --dryrun
Abdullah Khawer
  • 4,461
  • 4
  • 29
  • 66
veben
  • 19,637
  • 14
  • 60
  • 80
3
  1. Windows users need to download S3 Browser from this link, which also has installation instructions: http://s3browser.com/download.aspx

  2. Then provide your AWS credentials (access key, secret key and region) to S3 Browser. This link contains configuration instructions (copy-paste the link into a browser): s3browser.com/s3browser-first-run.aspx

  3. Now all your S3 buckets will be visible in the left panel of S3 Browser.

  4. Simply select the bucket, click on the Buckets menu in the top left corner, then select the "Download all files to" option from the menu. Below is a screenshot of the same:

Bucket Selection Screen

  5. Then browse to a folder to download the bucket to a particular place.

  6. Click on OK and your download will begin.

Patrick R
  • 6,621
  • 1
  • 24
  • 27
3

You just need to pass --recursive and --include "*" to the aws s3 cp command, as follows: aws --region "${BUCKET_REGION}" s3 cp s3://${BUCKET}${BUCKET_PATH}/ ${LOCAL_PATH}/tmp --recursive --include "*" 2>&1

Abdullah Khawer
  • 4,461
  • 4
  • 29
  • 66
Praveen Gowda
  • 156
  • 1
  • 5
2

aws s3 sync is the perfect solution. It does not do a two-way sync; it is one-way, from source to destination. Also, if you have lots of items in the bucket, it is a good idea to create an S3 VPC endpoint first so that the download happens faster (because it does not go over the public internet but over the AWS network) and incurs no data-transfer charges. A sketch of creating such an endpoint follows.
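
A rough sketch of creating such a gateway endpoint with the CLI; the VPC ID, route table ID and region here are placeholders you would need to replace:

aws ec2 create-vpc-endpoint \
    --vpc-id vpc-0abc1234 \
    --service-name com.amazonaws.us-east-1.s3 \
    --route-table-ids rtb-0abc1234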

Deepak Singhal
  • 10,568
  • 11
  • 59
  • 98
2

As @Layke said, it is best practice to download the file using the S3 CLI; it is safe and secure. But in some cases, people need to use wget to download the file, and here is the solution:

aws s3 presign s3://<your_bucket_name>/<object_key>

This will get you a temporary, pre-signed public URL which you can use to download the content from S3, in your case using wget or any other download client.
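
For instance, a sketch that presigns a single object for one hour and fetches it with wget (the bucket and key are placeholders):

url=$(aws s3 presign s3://<your_bucket_name>/<object_key> --expires-in 3600)
wget -O downloaded-object "$url"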

Balaji JB
  • 118
  • 2
  • 13
2

There are three options that allow you to download files or folders from an S3 bucket:

  • Option 1: Using AWS console
  • Option 2: Using AWS cli
  • Option 3: Using AWS SDK

About option 1:

  • Navigate to the file's location within the bucket.
  • Select the file by clicking on it.
  • Click the "Download" button from the top menu.
  • The file will be downloaded to your local machine.

About option 2: Use the following command:

aws s3 cp s3://bucket-name/file-path local-file-path
# or 
aws s3 cp s3://bucket-name/folder-path local-folder-path --recursive

About option 3: In this case I am using boto3

import boto3
import os

def download_folder_from_s3(bucket_name, folder_name, local_path):
    s3 = boto3.client('s3')

    try:
        # Note: list_objects_v2 returns at most 1,000 keys per call;
        # for bigger folders, paginate with ContinuationToken.
        response = s3.list_objects_v2(Bucket=bucket_name, Prefix=folder_name)
        for obj in response.get('Contents', []):
            file_name = obj['Key']
            if file_name.endswith('/'):
                continue  # skip folder placeholder keys
            local_file_path = os.path.join(local_path, os.path.basename(file_name))
            s3.download_file(bucket_name, file_name, local_file_path)
            print(f"File '{file_name}' downloaded to '{local_file_path}'")
    except Exception as e:
        print(f"Error downloading folder: {e}")

# Replace with your actual bucket name, folder name, and local path
bucket_name = 'your-bucket-name'
folder_name = 'techvuehub-folder/'
local_path = './downloaded-files/'

if not os.path.exists(local_path):
    os.makedirs(local_path)

download_folder_from_s3(bucket_name, folder_name, local_path)

Check out How to download files from S3 bucket for more information. Hopefully the options above make it easier for you to download files from S3.

Martin Prikryl
  • 188,800
  • 56
  • 490
  • 992
Chris
  • 41
  • 2
1

Here is some code to download all buckets, list them, and list their contents.

    //connection string
    private static void dBConnection() {
    app.setAwsCredentials(CONST.getAccessKey(), CONST.getSecretKey());
    conn = new AmazonS3Client(app.getAwsCredentials());
    app.setListOfBuckets(conn.listBuckets());
    System.out.println(CONST.getConnectionSuccessfullMessage());
    }

    private static void downloadBucket() {

    do {
        for (S3ObjectSummary objectSummary : app.getS3Object().getObjectSummaries()) {
            app.setBucketKey(objectSummary.getKey());
            app.setBucketName(objectSummary.getBucketName());
            if(objectSummary.getKey().contains(CONST.getDesiredKey())){
                //DOWNLOAD
                try 
                {
                    s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
                    s3Client.getObject(
                            new GetObjectRequest(app.getBucketName(),app.getBucketKey()),
                            new File(app.getDownloadedBucket())
                            );
                } catch (IOException e) {
                    e.printStackTrace();
                }

                do
                {
                     if(app.getBackUpExist() == true){
                        System.out.println("Converting back up file");
                        app.setCurrentPacsId(objectSummary.getKey());
                        passIn = app.getDataBaseFile();
                        CONVERT= new DataConversion(passIn);
                        System.out.println(CONST.getFileDownloadedMessage());
                    }
                }
                while(app.getObjectExist()==true);

                if(app.getObjectExist()== false)
                {
                    app.setNoObjectFound(true);
                }
            }
        }
        app.setS3Object(conn.listNextBatchOfObjects(app.getS3Object()));
    } 
    while (app.getS3Object().isTruncated());
}

//---------------------------- Extension Methods -------------------------------------

//Unzip bucket after download 
public static void unzipBucket() throws IOException {
    unzip = new UnZipBuckets();
    unzip.unZipIt(app.getDownloadedBucket());
    System.out.println(CONST.getFileUnzippedMessage());
}

//list all S3 buckets
public static void listAllBuckets(){
    for (Bucket bucket : app.getListOfBuckets()) {
        String bucketName = bucket.getName();
        System.out.println(bucketName + "\t" + StringUtils.fromDate(bucket.getCreationDate()));
    }
}

//Get the contents from the auto back up bucket
public static void listAllBucketContents(){     
    do {
        for (S3ObjectSummary objectSummary : app.getS3Object().getObjectSummaries()) {
            if(objectSummary.getKey().contains(CONST.getDesiredKey())){
                System.out.println(objectSummary.getKey() + "\t" + objectSummary.getSize() + "\t" + StringUtils.fromDate(objectSummary.getLastModified()));
                app.setBackUpCount(app.getBackUpCount() + 1);   
            }
        }
        app.setS3Object(conn.listNextBatchOfObjects(app.getS3Object()));
    } 
    while (app.getS3Object().isTruncated());
    System.out.println("There are a total of : " + app.getBackUpCount() + " buckets.");
}

}

John Hanewich
  • 167
  • 1
  • 11
1

You may simply get it with the s3cmd command:

s3cmd get --recursive --continue s3://test-bucket local-directory/
Hubbitus
  • 5,161
  • 3
  • 41
  • 47
1

As Neel Bhaat has explained in this blog, there are many different tools that can be used for this purpose. Some are provided by AWS, while most are third-party tools. All these tools require you to save your AWS account key and secret in the tool itself. Be very cautious when using third-party tools, as the credentials you save in them could be compromised and end up costing you dearly.

Therefore, I always recommend using the AWS CLI for this purpose. You can simply install this from this link. Next, run the following command and save your key, secret values in AWS CLI.

aws configure

Then use the following command to sync your AWS S3 bucket to your local machine (the local machine should have the AWS CLI installed):

aws s3 sync <source> <destination>

Examples:

1) For AWS S3 to Local Storage

aws s3 sync <S3Uri> <LocalPath>

2) From Local Storage to AWS S3

aws s3 sync <LocalPath> <S3Uri>

3) From AWS s3 bucket to another bucket

aws s3 sync <S3Uri> <S3Uri> 
Keet Sugathadasa
  • 11,595
  • 6
  • 65
  • 80
  • For example 3, can I point to a Bucket folder to another bucket folder? Actually, I want to sync a bucket folder to another bucket folder. – lukai Dec 04 '18 at 09:09
  • @lukai yes. This is what i have given in Example 3. You simply need to have the s3 bucket URIs of source and destination – Keet Sugathadasa Dec 04 '18 at 09:12
1

If the bucket is quite big, there is a tool called s4cmd which makes parallel connections and improves the download time:

To install it on Debian-like systems:

apt install s4cmd

If you have pip:

pip install s4cmd

It will read the ~/.s3cfg file if present (if not, install s3cmd and run s3cmd --configure), or you can specify --access-key=ACCESS_KEY --secret-key=SECRET_KEY on the command line.

The CLI is similar to s3cmd. In your case a sync is recommended, as you can cancel the download and start it again without having to re-download the files.

s4cmd [--access-key=ACCESS_KEY --secret-key=SECRET_KEY] sync s3://<your-bucket> /some/local/dir

Be careful: if you download a lot of data (>1 TB), it may impact your bill; calculate the cost first.

bartomeu
  • 486
  • 4
  • 5
1

Download AWS CLI to Download S3 Bucket Data

Step 1: Install the AWS CLI

If you haven't installed the AWS CLI already, you can follow the instructions in the AWS CLI User Guide for your specific operating system: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html

Step 2: Configure the AWS CLI. Open a command prompt or terminal and run:

aws configure

AWS Access Key ID [None]: <your_access_key>

AWS Secret Access Key [None]: <your_secret_key>

Default region name [None]: <YourBucketRegion>

Default output format [None]: json

Step 3: Download files from an S3 bucket

`aws s3 cp s3://<bucket-name> <local-directory> --recursive`

Note: Ensure that the AWS CLI user or role associated with your credentials has the necessary permissions to access and download objects from the specified S3 bucket.
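
As a rough illustration of such permissions (an assumption to adapt, not an official policy), an inline policy granting list and read access to a single bucket could be attached like this; the user name, policy name and bucket name are placeholders:

cat > s3-read-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Action": "s3:ListBucket", "Resource": "arn:aws:s3:::<bucket-name>" },
    { "Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::<bucket-name>/*" }
  ]
}
EOF
aws iam put-user-policy --user-name <iam-user> --policy-name s3-read-only --policy-document file://s3-read-policy.json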

0

You can use this AWS CLI command to download the entire S3 bucket content to a local folder:

aws s3 sync s3://your-bucket-name "Local Folder Path"

If you see an error like this:

fatal error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)

--no-verify-ssl (boolean)

By default, the AWS CLI uses SSL when communicating with AWS services. For each SSL connection, the AWS CLI will verify SSL certificates. This option overrides the default behavior of verifying SSL certificates. reference

use the --no-verify-ssl flag with the command:

aws s3 sync s3://your-bucket-name "Local Folder Path" --no-verify-ssl
Dimuthu
  • 1,611
  • 1
  • 14
  • 16
  • 2
    Use of the `s3 sync` is covered above multiple times already. + Suggesting a use of `--no-verify-ssl` without explaining its security consequences is a crime. – Martin Prikryl Jun 20 '19 at 05:55
  • Thanks for the information about security. I faced this issue and resolved it using this reference https://docs.aws.amazon.com/cli/latest/reference/ – Dimuthu Jun 20 '19 at 07:01
0

Use boto3 to download all objects in a bucket with a certain prefix:

import os
import boto3

# AWS_KEY_ID and AWS_SECRET must be defined with your credentials
s3 = boto3.client('s3', region_name='us-east-1',
                  aws_access_key_id=AWS_KEY_ID,
                  aws_secret_access_key=AWS_SECRET)

def get_all_s3_keys(bucket, prefix):
    """List every object key in the bucket that starts with the given prefix."""
    keys = []

    kwargs = {'Bucket': bucket, 'Prefix': prefix}
    while True:
        resp = s3.list_objects_v2(**kwargs)
        for obj in resp.get('Contents', []):
            keys.append(obj['Key'])

        # keep paginating until there is no continuation token left
        try:
            kwargs['ContinuationToken'] = resp['NextContinuationToken']
        except KeyError:
            break

    return keys

def download_file(file_name, bucket, key):
    s3.download_file(Filename=file_name, Bucket=bucket, Key=key)

bucket = "gid-folder"
prefix = "test_"
keys = get_all_s3_keys(bucket, prefix)

for key in keys:
    # create any local sub-directories implied by the key before downloading
    local_dir = os.path.dirname(key)
    if local_dir:
        os.makedirs(local_dir, exist_ok=True)
    download_file(key, bucket, key)
Golden Lion
  • 3,840
  • 2
  • 26
  • 35