92

How to exclude multiple folders while using aws s3 sync?

I tried:

    # aws s3 sync s3://inksedge-app-file-storage-bucket-prod-env \
                  s3://inksedge-app-file-storage-bucket-test-env \
                  --exclude 'reportTemplate/* orders/* customers/*'

But it still syncs the "customers" folder.

Output:

    copy: s3://inksedge-app-file-storage-bucket-prod-env/customers/116/miniimages/IMG_4800.jpg
       to s3://inksedge-app-file-storage-bucket-test-env/customers/116/miniimages/IMG_4800.jpg

    copy: s3://inksedge-app-file-storage-bucket-prod-env/customers/116/miniimages/DSC_0358.JPG
       to s3://inksedge-app-file-storage-bucket-test-env/customers/116/miniimages/DSC_0358.JPG
eth4io
Ashish Karpe
  • I believe you need the --exclude option for each pattern, i.e. `--exclude 'reportTemplate/*' --exclude 'orders/*' --exclude 'customers/*'`. Putting the whole thing in quotes like this most likely makes it a single pattern. – Florian Castellane Jul 22 '19 at 04:14

4 Answers

165

At last this worked for me:

aws s3 sync s3://my-bucket s3://my-other-bucket \
            --exclude 'customers/*' \
            --exclude 'orders/*' \
            --exclude 'reportTemplate/*'  
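
When the folder list grows, the repeated flags can be generated instead of typed out. The following is a minimal sketch, not part of the original answer: `sync_excluding` and the `AWS_CMD` variable are hypothetical names introduced here, and the bucket names are placeholders. It builds one `--exclude` argument pair per folder and, with `AWS_CMD=echo`, only prints the command it would run.

```shell
# sync_excluding SRC DST FOLDER...
# Hypothetical helper: calls "aws s3 sync" with one --exclude flag per folder.
# Set AWS_CMD=echo to print the command instead of running it.
sync_excluding() {
  src=$1
  dst=$2
  shift 2
  # Replace each folder name in "$@" with the pair: --exclude 'folder/*'
  for folder in "$@"; do
    shift
    set -- "$@" --exclude "$folder/*"
  done
  # --dryrun lists what would be copied without copying anything.
  ${AWS_CMD:-aws} s3 sync "$src" "$dst" "$@" --dryrun
}

# Print the command only (placeholder bucket names).
AWS_CMD=echo
sync_excluding s3://my-bucket s3://my-other-bucket customers orders reportTemplate
```

Each pattern stays a separate, quoted argument, which is the difference from the single-string attempt in the question. Unset `AWS_CMD` and drop `--dryrun` once the printed command looks right.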

Hint: enclose wildcards and special characters in single or double quotes, so that the shell does not expand them before they reach the AWS CLI. The supported pattern symbols are listed below. For more information on the S3 commands, see the Amazon documentation.

*: Matches everything
?: Matches any single character
[sequence]: Matches any character in sequence
[!sequence]: Matches any character not in sequence
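
To get a feel for these four symbols without touching a bucket, they can be tried in a shell `case` statement, which uses the same notation and also lets `*` match across `/`. This is only a rough local approximation, assuming the two matchers agree on simple patterns like these; the AWS CLI does its own matching. `matches` is a hypothetical helper, and all keys except the first are made up for the example.

```shell
# matches KEY PATTERN: succeeds when the whole KEY matches the glob PATTERN.
# Hypothetical helper for illustration only; it is not part of the AWS CLI.
matches() {
  case "$1" in
    $2) return 0 ;;   # unquoted, so $2 is treated as a pattern
    *)  return 1 ;;
  esac
}

matches 'customers/116/miniimages/IMG_4800.jpg' 'customers/*' && echo '* spans nested folders'
matches 'orders/2019.csv' 'orders/201?.csv' && echo '? matches exactly one character'
matches 'report1.txt' 'report[0-9].txt' && echo '[sequence] matches a listed character'
matches 'reportA.txt' 'report[!0-9].txt' && echo '[!sequence] matches an unlisted character'
matches 'orders/1.txt' 'customers/*' || echo 'a different prefix does not match'
```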
Hosam Aly
Ashish Karpe
  • Amazon provides AWS CLI, a command line tool for interacting with AWS. With AWS CLI, that entire process took less than three seconds: `$ aws s3 sync s3:///`. For example: `aws s3 sync s3://s3.aws-cli.demo/photos/office ~/Pictures/work` – Tapan Banker Nov 03 '19 at 21:35
30

If you are syncing a subfolder of a bucket, note that the exclude and include filters apply to the files and folders inside the folder being synced, not to the path relative to the bucket root. For example:

aws s3 sync s3://bucket1/bootstrap/ s3://bucket2/bootstrap --exclude '*' --include 'css/*'

would sync bootstrap/css, but neither bootstrap/js nor bootstrap/fonts, in the following folder tree:

bootstrap/
├── css/
│   ├── bootstrap.css
│   ├── bootstrap.min.css
│   ├── bootstrap-theme.css
│   └── bootstrap-theme.min.css
├── js/
│   ├── bootstrap.js
│   └── bootstrap.min.js
└── fonts/
    ├── glyphicons-halflings-regular.eot
    ├── glyphicons-halflings-regular.svg
    ├── glyphicons-halflings-regular.ttf
    └── glyphicons-halflings-regular.woff

That is, the filter is 'css/*' and not 'bootstrap/css/*'

More at https://docs.aws.amazon.com/cli/latest/reference/s3/index.html#use-of-exclude-and-include-filters
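
The relative-path rule can be modelled locally in a few lines of shell. This is a simplified sketch, not the CLI's actual implementation: `would_sync` is a hypothetical helper that strips the source prefix from a key and then applies `--exclude '*'` followed by a single `--include` pattern, with the later filter taking precedence.

```shell
# would_sync PREFIX KEY INCLUDE: prints "sync" or "skip" for
#   aws s3 sync s3://bucket/PREFIX ... --exclude '*' --include INCLUDE
# Hypothetical model for illustration; the real matching happens inside the AWS CLI.
would_sync() {
  relative=${2#"$1"}        # path of the key relative to the source prefix
  verdict=skip              # --exclude '*' drops every key first
  case "$relative" in
    $3) verdict=sync ;;     # a later --include overrides the earlier exclude
  esac
  echo "$verdict"
}

would_sync 'bootstrap/' 'bootstrap/css/bootstrap.css' 'css/*'            # prints: sync
would_sync 'bootstrap/' 'bootstrap/js/bootstrap.js'   'css/*'            # prints: skip
would_sync 'bootstrap/' 'bootstrap/css/bootstrap.css' 'bootstrap/css/*'  # prints: skip
```

The last call shows why a bucket-relative pattern fails here: once the prefix is stripped, the key is `css/bootstrap.css`, which never starts with `bootstrap/`.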

  • Thanks, that's the only answer that helped me. But what is the logic behind it, that is why a filter `bootstrap/css/*` will not work? – Itamar Katz Dec 08 '22 at 16:43
  • @ItamarKatz it is because the filter applies to the folder selected, so it would actually be looking to include a folder `s3://bucket2/bootstrap/bootstrap/css/*` if given the filter you provided. – Jon Jun 20 '23 at 18:18
4

From a Windows command prompt, single quotes (') do not work; only double quotes (") do. So use double quotes around wildcards, e.g.:

aws s3 sync  s3://bucket-1/ . --exclude "reportTemplate/*" --exclude "orders/*"

Single quotes do not work on Windows 10 (tested with the --dryrun option).

0

I used a slightly different approach for a folder structure with multiple levels: use '**' with --include.

Command:

aws s3 sync s3://$SOURCE_BUCKET/dir1/dir2/ s3://$TARGET_BUCKET/dir1/dir2/ --include '**/**'
sumitya