I am trying to rename a number of files using Python so that they follow a new naming convention that looks like:

~/directory/yyyy + qq + directory_name + ' Letter'.

Right now, they are in this format:

~/directory/directory_name + yyyy + qq + ' Letter'.

So for example, I have a directory called /Users/Test/rename_test/Salmon 2, and in it are the following files:

  • /Users/Test/rename_test/Salmon 2/Salmon 2 2013 Q4 Letter.pdf
  • /Users/Test/rename_test/Salmon 2/Salmon 2 2018 Q1 Letter.pdf
  • /Users/Test/rename_test/Salmon 2/Salmon 2 2015 Q2 Letter.pdf

I'd like to rename all of those files to:

  • /Users/Test/rename_test/Salmon 2/2013 Q4 Salmon 2 Letter.pdf
  • /Users/Test/rename_test/Salmon 2/2018 Q1 Salmon 2 Letter.pdf
  • /Users/Test/rename_test/Salmon 2/2015 Q2 Salmon 2 Letter.pdf

I've looked at using str.split() on the file name to extract positions [-2] and [-3], since those should always be qq and yyyy, and then renaming the file by moving them to positions [0] and [1]. But I have hundreds of directories and thousands of files, so I'm worried that one typo, or one file that deviates from the current convention, may result in an error.

So what's the best way to approach this?

user53526356

3 Answers

Your idea is basically sound - but you could add at least two steps to make it safer:

  1. Use a regex that detects your format, and raise an error if a file name does not match it. E.g. your regex could look like this:

    ^Salmon 2 \d{4} Q\d{1} Letter

You would need to replace Salmon 2 with a variable that contains your current directory name (escaped with re.escape, in case the name contains regex metacharacters).

  2. Add a "dry run" mode - where the conversions are just printed to stdout - so you can see what would happen.

  3. Whatever you do, make a backup FIRST.

Bonus: I would probably use pathlib and its methods for this job - it's a more versatile interface than the old "file names are just strings" approach.
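
The steps above can be sketched with pathlib roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes every file sits directly inside the directory whose name it starts with, and the names plan_rename, rename_tree and dry_run are hypothetical helpers made up for illustration.

```python
import re
from pathlib import Path


def plan_rename(path: Path) -> Path:
    """Return the new path for `path`, or raise ValueError if the name deviates."""
    directory_name = path.parent.name
    # re.escape guards against directory names containing regex metacharacters.
    pattern = re.compile(
        r"^" + re.escape(directory_name) + r" (\d{4}) (Q\d) Letter$"
    )
    match = pattern.match(path.stem)
    if match is None:
        raise ValueError(f"unexpected file name: {path}")
    year, quarter = match.groups()
    return path.with_name(f"{year} {quarter} {directory_name} Letter{path.suffix}")


def rename_tree(root: Path, dry_run: bool = True) -> None:
    """Walk every PDF under `root`; only print the plan unless dry_run is False."""
    for old_path in sorted(root.rglob("*.pdf")):
        try:
            new_path = plan_rename(old_path)
        except ValueError as error:
            # Deviating names (including already renamed files) are reported, not touched.
            print(f"SKIP  {error}")
            continue
        if new_path.exists():
            print(f"SKIP  target already exists: {new_path}")
            continue
        print(f"{old_path} -> {new_path}")
        if not dry_run:
            old_path.rename(new_path)
```

Calling rename_tree(Path('/Users/Test/rename_test')) only prints the planned renames; passing dry_run=False performs them once the printed output looks right. Files that were already renamed no longer match the pattern, so a second run skips them instead of mangling them.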

Christian Sauer

You can use either os.rename or shutil.move.

In your case, if your naming structure is certain to be what you posted, then you can do it without regex:

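import os
import shutil

old_path = '/Users/Test/rename_test/Salmon 2/Salmon 2 2013 Q4 Letter.pdf'
# Split the file name on whitespace: ['Salmon', '2', '2013', 'Q4', 'Letter.pdf']
x = os.path.basename(old_path).split()
# Move the year and quarter to the front: '2013 Q4 Salmon 2 Letter.pdf'
x = ' '.join((x[2], x[3], x[0], x[1], x[4]))
new_path = os.path.join(os.path.dirname(old_path), x)
# shutil.move(old_path, new_path)  # renames in place
shutil.copyfile(old_path, new_path)  # safer: leaves the original untouched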

TRY it with print statements, please!


You'll have to wrap this with a for loop over all files. Better to copy to a new directory, and then manually confirm the results.

import glob
for f in glob.glob('/Users/Test/rename_test/Salmon 2/*.pdf'):
    old_path = f
    # ... code above goes here, using this old_path ...

This is another reason to copy to a separate directory instead of moving in place: if the renamed files stay in the same directory with the same extension, the glob will pick them up again each time you run the code (and break the code above).

philshem
  • I don't think I can use `x = ' '.join((x[2],x[3],x[0],x[1],x[3],x[4]))` because some of the other directories have file names which may have 6 positions, or 9, etc. The only things that are consistent in the existing naming convention are positions `[-1]`, `[-2]`, and `[-3]`. For example, another directory has files named `/Users/Test/rename_test/A Longer Example/A Longer Example 2013 Q4 Letter.pdf` – user53526356 Apr 11 '19 at 21:35
  • yes, then you'd have to use the negative positions and then maybe `a[0:index]` to get the starting ones – philshem Apr 12 '19 at 06:14
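
The negative-index idea from these comments could look something like the sketch below. It assumes the file name always ends in yyyy, qq and 'Letter' before the extension, and that any number of words may come before them; reorder_name is a hypothetical helper name, not an existing function.

```python
import os


def reorder_name(filename: str) -> str:
    """Move the year and quarter tokens to the front of a file name."""
    stem, extension = os.path.splitext(filename)
    tokens = stem.split()
    if len(tokens) < 4:
        raise ValueError(f"too few parts: {filename!r}")
    # The last three tokens are the only fixed positions: yyyy, qq, 'Letter'.
    year, quarter, suffix = tokens[-3:]
    if not (len(year) == 4 and year.isdigit()):
        raise ValueError(f"no 4-digit year at position [-3]: {filename!r}")
    if quarter not in {"Q1", "Q2", "Q3", "Q4"}:
        raise ValueError(f"no quarter at position [-2]: {filename!r}")
    if suffix != "Letter":
        raise ValueError(f"name does not end in 'Letter': {filename!r}")
    # Everything before those three tokens is the directory name, however long.
    prefix = tokens[:-3]
    return " ".join([year, quarter, *prefix, suffix]) + extension
```

Because the checks raise ValueError on anything unexpected, a file that was already renamed (or that contains a typo) is rejected rather than silently reshuffled. Note that split() collapses repeated spaces, so a name with a double space would come back with single spaces.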

You can try using re.sub() to perform the name manipulation. For example:

import re
import os

old_path = '/Users/Test/rename_test/Salmon 2/Salmon 2 2013 Q4 Letter.pdf'
old_filename = os.path.basename(old_path) # Salmon 2 2013 Q4 Letter.pdf
directory = os.path.basename(os.path.dirname(old_path)) # Salmon 2

pattern = r'(' + re.escape(directory) + r')\s([0-9]{4})\s(Q[1-4])\sLetter\.pdf'
new_filename = re.sub(pattern, r'\2 \3 \1 Letter.pdf', old_filename) # 2013 Q4 Salmon 2 Letter.pdf

As you can see, this method lets you define a pattern in which you identify groups with parentheses (...), and then rearrange the string using those groups in the order you like. Each group can be selected with \n, where n is the group number.

For more info: https://docs.python.org/3.6/library/re.html#re.sub
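
One caveat with this approach: re.sub returns the input string unchanged when the pattern does not match, so a deviating file name would pass through silently. A possible way to detect that, sketched here under the same naming assumptions, is re.subn, which also returns the number of substitutions made. The function name swap_with_groups is hypothetical.

```python
import re


def swap_with_groups(directory: str, filename: str) -> str:
    """Rearrange 'dir yyyy Qn Letter.pdf' into 'yyyy Qn dir Letter.pdf'."""
    # Anchored so that the whole file name has to follow the convention.
    pattern = (
        r"^(" + re.escape(directory) + r")\s([0-9]{4})\s(Q[1-4])\sLetter\.pdf$"
    )
    # subn returns a (new_string, number_of_substitutions) tuple.
    new_filename, count = re.subn(pattern, r"\2 \3 \1 Letter.pdf", filename)
    if count != 1:
        raise ValueError(f"{filename!r} does not follow the expected convention")
    return new_filename
```

For example, swap_with_groups('Salmon 2', 'Salmon 2 2013 Q4 Letter.pdf') gives '2013 Q4 Salmon 2 Letter.pdf', while a name such as 'Salmon 2 2013 Q5 Letter.pdf' raises ValueError instead of being returned as is.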

MingiuX