
I have files that need to be pulled daily via the AWS S3 CLI.

I'm doing this for ~80 files. They have static info besides a dynamic date, which I already solve for, but now we're introducing a new type. The filename of the new type is:

ACR_{{randomInt}}_YYYY_MM_DD_ThiFil.csv

Currently I use a batch file to save yesterday's date as %yesterday%, which works by doing

aws s3 cp s3://~~~directorystuff/ACR_StaticInfo_%yesterday%_ThiFil.csv C:\localDirStuff~~~\ACR_StaticInfo_%yesterday%_ThiFil.csv
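For reference, here is a sketch of how %yesterday% can be computed in the batch file via a PowerShell call (my actual date logic may differ, and the format string may need adjusting to match the real filename date separator):

rem Sketch: store yesterday's date in %yesterday% (assumes PowerShell is available;
rem adjust 'yyyy_MM_dd' vs 'yyyy-MM-dd' to match the actual filenames)
for /f "usebackq delims=" %%i in (`powershell -NoProfile -Command "(Get-Date).AddDays(-1).ToString('yyyy_MM_dd')"`) do set yesterday=%%i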

This works because of the static info. With the randomInt (which I also need to keep in the final filename) I'm having issues. I know the AWS CLI uses --include to make up for the lack of wildcard support, but I get a "stream is not seekable" error every time I try.

What I'm currently doing is not scalable at all: I recursively pull the entire directory and delete everything that isn't from today. I hate this method.

How can I use the AWS CLI to pull only the specific files I need?

Note: the randomInt changes weekly, and every day has 30-40 different ones, which is why I can't keep an array of them to filter through and pull each one.

Update

I've also tried aws s3 cp C:\localDir\ s3://remoteDir --include "2017-10-12" and I still get "stream is not seekable".


1 Answer

How can I use wildcards to `cp` a group of files with the AWS CLI

Following similar logic to that post, using --recursive together with --exclude/--include fixed my "stream is not seekable" issue.

For reference, the command from that post:

aws s3 cp s3://data/ . --recursive --exclude "*" --include "2016-08*"
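Adapted to my case (using the same placeholder paths as in the question), something along these lines pulls only yesterday's files while keeping the randomInt in the downloaded filenames:

aws s3 cp s3://~~~directorystuff/ C:\localDirStuff~~~\ --recursive --exclude "*" --include "ACR_*_%yesterday%_ThiFil.csv"

The --exclude "*" drops everything first, and the later --include re-adds only the matching keys, since filters that appear later take precedence.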
