I am trying to create an rsync command to upload files into a bucket on Google Cloud Storage.

We want to go through the entire computer and upload only CSV files into the bucket. I know that gsutil's rsync command provides an option to exclude files (-x), but I am having trouble figuring out a regex that excludes every file except .csv files.

For example, if I have these files in these directories:

folder/folder/hello.csv

folder/folder/results.pdf

folder/folder2/test.txt

folder/folder3/hello2.csv

I expect to only upload:

folder/folder/hello.csv

folder/folder3/hello2.csv

Any idea what this regex would look like?

Thanks!

kennycodes
  • Does this answer your question? [rsync copy over only certain types of files using include option](https://stackoverflow.com/questions/11111562/rsync-copy-over-only-certain-types-of-files-using-include-option) – Cyrus Jun 23 '21 at 21:12
  • @Cyrus I'm not sure if gsutil rsync command is the same as the native rsync command, I was looking at this documentation https://cloud.google.com/storage/docs/gsutil/commands/rsync and it seems that there isn't an include or exclude option. – kennycodes Jun 23 '21 at 21:16
  • `find . -name '*.csv' | xargs -I {} gsutil cp {} gs://` – jabbson Jun 23 '21 at 23:46

1 Answer

The gsutil rsync command takes an option -x to exclude paths matching a pattern given in Python regex syntax. The Python regex documentation has an example of using negative lookahead to match "all but" the filename extension .bat. Adapting that example to .csv, you could try

-n -x '.*[.](?!csv$)[^.]*$'

The -n makes it a dry run: gsutil only shows what would happen, without copying anything. Remove it once the output looks right.
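To see why the lookahead works, here is a minimal sketch that checks the pattern against the four paths from the question using Python's `re` module. It assumes gsutil applies the pattern to each path relative to the source directory and matches from the start of the string, as `re.match` does; the `kept` helper is a hypothetical name used only for this illustration.

```python
import re

# The exclude pattern from the answer: match any path whose final
# extension is something other than "csv".
EXCLUDE = re.compile(r'.*[.](?!csv$)[^.]*$')

# Sample paths taken from the question.
paths = [
    "folder/folder/hello.csv",
    "folder/folder/results.pdf",
    "folder/folder2/test.txt",
    "folder/folder3/hello2.csv",
]

def kept(candidates):
    """Return the paths that the exclude pattern does NOT match (hypothetical helper)."""
    return [p for p in candidates if not EXCLUDE.match(p)]

if __name__ == "__main__":
    for p in kept(paths):
        print(p)
```

Note that the pattern only matches names that contain a dot, so a file with no extension at all would not be excluded by it.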

meuh