-1

I have a bucket on Amazon S3 with thousands of files that contain double spaces in their names.

How can I replace all the double spaces with one space?

For example, folder1/folder2/file  name.pdf should become folder1/folder2/file name.pdf

John Rotenstein
Moshe Fortgang

2 Answers

2

Option 1: Use a spreadsheet

One 'cheat method' I sometimes use is to create a spreadsheet and then generate commands:

  • Extract a list of all files with double-spaces:
aws s3api list-objects --bucket bucket-name --query 'Contents[].[Key]' --output text | grep '  ' > file_list.csv
  • Open the file in Excel
  • Write a formula in Column B that creates an aws s3 mv command:
="aws s3 mv 's3://bucket-name/"&A1&"' 's3://bucket-name/"&SUBSTITUTE(A1,"  "," ")&"'"
  • Test it by copying the output and running it in a terminal
  • If it works, Copy Down to the other rows, copy and paste all the commands into a shell script, then run the shell script
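
If a spreadsheet is not handy, the same commands can be generated with a few lines of Python. This is only a sketch: mv_commands is a hypothetical helper name, bucket-name is the same placeholder as above, and the input is assumed to hold one key per line, as file_list.csv does. Unlike a single SUBSTITUTE pass, which can leave a double space behind when a name contains three spaces in a row, the regular expression collapses each whole run, and shlex.quote keeps keys that contain a single quote from breaking the generated command.

```python
import re
import shlex


def mv_commands(keys, bucket="bucket-name"):
    """Yield one `aws s3 mv` command per key that contains a double space."""
    for key in keys:
        key = key.rstrip("\n")
        if "  " not in key:
            continue
        new_key = re.sub(r" {2,}", " ", key)  # collapse each run of spaces
        src = shlex.quote(f"s3://{bucket}/{key}")
        dst = shlex.quote(f"s3://{bucket}/{new_key}")
        yield f"aws s3 mv {src} {dst}"


# Example with an in-memory list; pass an open file_list.csv instead to
# process the real listing.
for command in mv_commands(["folder1/folder2/file  name.pdf"]):
    print(command)
```

As with the spreadsheet, test one generated command by hand before running the rest.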

Option 2: Write a script

Or, you could write a script in your favourite language (eg Python) that will:

  • List the bucket
  • Loop through each object
  • If the object Key has double-spaces:
    • Copy the object to a new Key
    • Delete the original object
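
A minimal sketch of those steps in Python, assuming the boto3 library and configured AWS credentials. The function names and the dry_run flag are illustrative, not part of any AWS API. The s3 argument is expected to be a boto3 S3 client; the calls used are get_paginator("list_objects_v2"), copy_object and delete_object.

```python
import re


def collapse_spaces(key):
    """Replace every run of two or more spaces with a single space."""
    return re.sub(r" {2,}", " ", key)


def rename_double_space_keys(s3, bucket, dry_run=True):
    """Copy each key containing a double space to its collapsed name,
    then delete the original.

    Returns the (old_key, new_key) pairs that were (or, with dry_run=True,
    would be) renamed.
    """
    renamed = []
    # The paginator follows continuation tokens, so buckets with more
    # than 1000 objects are listed in full.
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            old_key = obj["Key"]
            if "  " not in old_key:
                continue
            new_key = collapse_spaces(old_key)
            if not dry_run:
                s3.copy_object(
                    Bucket=bucket,
                    Key=new_key,
                    CopySource={"Bucket": bucket, "Key": old_key},
                )
                s3.delete_object(Bucket=bucket, Key=old_key)
            renamed.append((old_key, new_key))
    return renamed


# Usage (needs boto3 and credentials, so it is left as a comment):
#   import boto3
#   pairs = rename_double_space_keys(boto3.client("s3"), "bucket-name")
#   print(pairs)  # inspect, then call again with dry_run=False
```

The dry run is the default so the list of renames can be checked first. Two caveats with this sketch: a copy silently overwrites any object that already exists at the new key, and a single copy_object call is limited to objects of up to 5 GB.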
John Rotenstein
2

Based on the idea from @john-rotenstein, I built a bash command that does it in one line:

aws s3 ls --recursive s3://bucket-name | cut -c32- | grep '  ' | while IFS= read -r line; do aws s3 mv "s3://bucket-name/$line" "s3://bucket-name/$(printf '%s' "$line" | tr -s ' ')"; done
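  • List every object path in the bucket
  • Cut each line down to just the object path (it starts at column 32 of the aws s3 ls output)
  • Keep only the paths that contain a double space
  • Move each object to the same path with every run of spaces collapsed to a single space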
Moshe Fortgang