
Hi all.

I have around 900 objects in the following structure:

s3/bucketname/INGESTIONDATE=2022-02-20/file.csv
s3/bucketname/INGESTIONDATE=2022-02-21/file.csv
s3/bucketname/INGESTIONDATE=2022-02-22/file.csv

etc.

I need to change the file path to:

s3/bucketname/ingest_date=2022-02-20/file.json
s3/bucketname/ingest_date=2022-02-21/file.json
s3/bucketname/ingest_date=2022-02-22/file.json

Since there are around 900 objects, I do not plan on doing this by hand through the console.

Also, I am not too bothered about the JSON conversion; I can do that myself. It is mainly changing the file paths and copying to a new bucket. Any ideas?

Zack Amin
1 Answer


You can do something like the following with the AWS CLI:

aws s3 ls s3://bucketname/ --recursive | awk '{print $4}' | while read -r f; do newf=${f//INGESTIONDATE/ingest_date} && aws s3 mv "s3://bucketname/${f}" "s3://bucketname/${newf%???}json" ; done
  • aws s3 ls s3://bucketname/ --recursive - lists every object key in the bucket (--recursive is needed because the files sit under the INGESTIONDATE=... prefixes), and awk '{print $4}' pulls the key out of each listing line
  • We iterate over these keys as f
  • newf=${f//INGESTIONDATE/ingest_date} replaces INGESTIONDATE with ingest_date and stores the result in newf
  • aws s3 mv then moves each object to its new key, changing the suffix from csv to json (${newf%???} strips the last 3 characters, i.e. csv, and json is appended)
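The question also mentions copying to a new bucket rather than renaming in place. A minimal variant of the same loop (assuming a destination bucket, here called newbucket, already exists) simply swaps aws s3 mv for aws s3 cp:

aws s3 ls s3://bucketname/ --recursive | awk '{print $4}' | while read -r f; do
    # copy each object to the new bucket under the renamed key
    newf=${f//INGESTIONDATE/ingest_date}
    aws s3 cp "s3://bucketname/${f}" "s3://newbucket/${newf%???}json"
done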

For testing, you can add the --dryrun flag to the aws s3 mv command to see the operations that would be performed without actually moving anything.

As pointed out by @Konrad in the comments, this assumes there are no whitespace/newline characters in the object keys (which would make awk return a truncated path).
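If the keys really might contain whitespace or newlines, a safer sketch (assuming jq is installed and can emit NUL bytes, and again using newbucket as a placeholder destination) is to list the keys with aws s3api as JSON and iterate over NUL-delimited values instead of parsing the aws s3 ls text output:

# list the keys as JSON, emit them NUL-delimited, and read them back safely
aws s3api list-objects-v2 --bucket bucketname --output json \
  | jq -j '.Contents[]?.Key + "\u0000"' \
  | while IFS= read -r -d '' key; do
        newkey=${key//INGESTIONDATE/ingest_date}   # rename the partition prefix
        newkey=${newkey%.csv}.json                 # swap the extension to .json
        aws s3 cp "s3://bucketname/${key}" "s3://newbucket/${newkey}" --dryrun
    done

The --dryrun flag means the loop only prints what it would copy; drop it once the output looks right.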

Max
    Careful — just like POSIX files, S3 objects can have pretty much arbitrary characters in their names, including whitespace and newlines. Your code snippet will fail to handle these correctly. – Konrad Rudolph Feb 22 '22 at 10:29