Let's say I have the following files in a directory:

snackbox_1a.dat
zebrabar_3z.dat
cornrows_00.dat
meatpack_z2.dat

I have SEVERAL of these directories, in which all of the files follow the same naming format, i.e.:

snackbox_xx.dat
zebrabar_xx.dat
cornrows_xx.dat
meatpack_xx.dat

So what I KNOW about these files is the first bit (snackbox, zebrabar, cornrows, meatpack). What I don't know is the bit just before the file extension (the 'xx'). It varies both from file to file within a directory and from directory to directory (so another directory might have different xx values, like 12, yy, 2m, 0t, whatever).

Is there a way for me to rename all of these files, or truncate them all (since the xx.dat will always be the same length), for ease of use when attempting to call them? For instance, I'd like to rename them so that I can, in another script, use a simple index to step through and find the file I want (instead of having to go into each directory and pull the file out manually).

In other words, I'd like to change the file names to:

snackbox.dat
zebrabar.dat
cornrows.dat
meatpack.dat

Thanks!

sct_2015

3 Answers

You can use shutil.move to move files. To calculate the new filename, you can use Python's string split method:

original_name = "snackbox_12.dat"
truncated_name = original_name.split("_")[0] + ".dat"
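
Putting the two pieces together, a minimal sketch for one directory might look like the following. The helper name `strip_suffix_in_dir` is made up for illustration, and it assumes the unknown part always follows the last underscore in the name. It skips any file whose shortened name already exists rather than overwriting it.

```python
import os
import shutil


def strip_suffix_in_dir(directory, extension=".dat"):
    """Rename e.g. snackbox_1a.dat to snackbox.dat inside one directory.

    Hypothetical helper: assumes the unknown part follows the LAST underscore.
    Returns the sorted list of new file names that were created.
    """
    renamed = []
    for name in sorted(os.listdir(directory)):
        stem, ext = os.path.splitext(name)
        if ext != extension or "_" not in stem:
            continue  # leave unrelated files alone
        prefix = stem.rpartition("_")[0]
        if not prefix:
            continue  # nothing usable before the underscore
        target = os.path.join(directory, prefix + ext)
        if os.path.exists(target):
            continue  # never overwrite an existing file
        shutil.move(os.path.join(directory, name), target)
        renamed.append(prefix + ext)
    return renamed
```

Calling it once per directory (for example inside the loop that already steps through the directories) should leave each one with the short names.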
Joshua Gevirtz

Try re.sub:

import re
filename = 'snackbox_xx.dat'
filename_new = re.sub(r'_[A-Za-z0-9]{2}', '', filename)

You should get 'snackbox.dat' for filename_new.

This assumes the two characters after the "_" are each a digit or an upper- or lowercase letter, but you could expand the character classes in the regular expression if your suffixes contain anything else.

EDIT: including moving and recursive search:

import shutil, re, os, fnmatch
directory = 'your_path'

for root, dirnames, filenames in os.walk(directory):
    for filename in fnmatch.filter(filenames, '*.dat'):
        filename_new = re.sub(r'_[A-Za-z0-9]{2}', '', filename)
        shutil.move(os.path.join(root, filename), os.path.join(root, filename_new))
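
The pattern above would also strip a two-character suffix from unrelated files such as foo_22.txt if they were passed to it. One possible way to tighten it is to anchor the pattern to the known prefixes and to the .dat extension. This is only a sketch: `KNOWN_PREFIXES`, `PATTERN`, and `new_name_for` are illustrative names, and the prefix list is taken from the question's example files.

```python
import re

# Assumed list of known prefixes, taken from the question's example names.
KNOWN_PREFIXES = ["snackbox", "zebrabar", "cornrows", "meatpack"]

# Match "<prefix>_<two alphanumerics>.dat" and nothing else.
PATTERN = re.compile(
    r"^({})_[A-Za-z0-9]{{2}}\.dat$".format("|".join(map(re.escape, KNOWN_PREFIXES)))
)


def new_name_for(filename):
    """Return the shortened name, or None if the file should be left alone."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    return match.group(1) + ".dat"
```

Inside the os.walk loop, a file for which `new_name_for` returns None would simply be skipped.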
sharshofski
  • How do you suggest finding all the files which is probably about 80 percent of what the OP is asking? – Padraic Cunningham Feb 19 '15 at 19:30
  • glob only searches a single directory, you would need to recursively check – Padraic Cunningham Feb 19 '15 at 19:32
  • You would also need os.path.join and to check that you are not changing the name of files you should not be. `foo_22.txt` will also match your regex – Padraic Cunningham Feb 19 '15 at 19:36
  • It'll be no problem finding the files. The names I offered were generalized. The actual file names are much more complicated, but have a nice indexing feature at the end which I can step through with for loops pretty easily (I've already done it to copy the directories, which share the same indexing). I was really just looking for a single command which would take an unknown filename and rename it. Will try out all of these suggestions. Might take some time as I have other tasks, but I will revisit when I've come to a suitable conclusion. Thanks everyone! – sct_2015 Feb 19 '15 at 19:45

This solution renames all files in the current directory that match the pattern in the function call.

What the function does

snackbox_5R.txt  >>>  snackbox.txt
snackbox_6y.txt  >>>  snackbox_0.txt
snackbox_a2.txt  >>>  snackbox_1.txt
snackbox_Tm.txt  >>>  snackbox_2.txt

Let's look at the function's inputs and some examples.

list_of_files_names This is a list of strings, where each string is the filename without the _?? part.

Examples:

  • ['snackbox.txt', 'zebrabar.txt', 'cornrows.txt', 'meatpack.txt', 'calc.txt']

  • ['text.dat']

upper_bound=1000 This is an integer. When the ideal filename is already taken, e.g. snackbox.dat already exists, the function tries snackbox_0.dat, snackbox_1.dat, and so on, up to snackbox_999.dat with the default bound. You shouldn't have to change the default.
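
The fallback naming described here can be sketched in isolation. `first_free_name` is a hypothetical helper written for illustration and is not part of the code below; it only computes the name and leaves the actual renaming to the caller.

```python
import os


def first_free_name(directory, stem, ext, upper_bound=1000):
    """Return the first unused name among stem.ext, stem_0.ext, stem_1.ext, ...

    Returns None if every candidate up to the bound is already taken.
    """
    candidates = ["{}.{}".format(stem, ext)]
    candidates += ["{}_{}.{}".format(stem, i, ext) for i in range(upper_bound)]
    for candidate in candidates:
        if not os.path.exists(os.path.join(directory, candidate)):
            return candidate
    return None
```

Returning None when the bound is exhausted makes that case visible to the caller instead of silently leaving the file unrenamed.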


The Code

import re
import os
import os.path


def find_and_rename(dir, list_of_files_names, upper_bound=1000):
    """
    :param list_of_files_names: List. A list of strings: filename (without the _??) + extension, e.g. snackbox.txt
    Renames snackbox_R5.dat to snackbox.dat, etc.
    """
    # split item in the list_of_file_names into two parts, filename and extension "snackbox.dat" -> "snackbox", "dat"
    list_of_files_names = [(prefix.split('.')[0], prefix.split('.')[1]) for prefix in list_of_files_names]

    # store the content of the dir in a list
    list_of_files_in_dir = os.listdir(dir)

    for file_in_dir in list_of_files_in_dir:  # list all files and folders in current dir
        file_in_dir_full_path = os.path.join(dir, file_in_dir)  # the full path is needed for .isfile() and for renaming
        print()  # DEBUG
        print('Is "{}" a file?: '.format(file_in_dir), end='')  # DEBUG
        print(os.path.isfile(file_in_dir_full_path))  # DEBUG
        if os.path.isfile(file_in_dir_full_path):  # filters out the folder, only files are needed

            # file_name is a tuple containing the filename prefix and the extension
            for file_name in list_of_files_names:  # check if the file matches one of our renaming prefixes

                # match both the file name (e.g. "snackbox") and the extension (e.g. "txt")
                # It finds "snackbox_5R.txt" by matching "snackbox" at the front and "txt" at the end
                if re.match(r'{}_\w+\.{}$'.format(re.escape(file_name[0]), re.escape(file_name[1])), file_in_dir):
                    print('\nOriginal File: ' + file_in_dir)  # printing this is not necessary
                    print('.'.join(file_name))

                    ideal_new_file_name = '.'.join(file_name)  # name might already be taken
                    # print(ideal_new_file_name)
                    if os.path.isfile(os.path.join(dir, ideal_new_file_name)):  # file already exists
                        # go up a name, e.g. "snackbox.dat" --> "snackbox_0.dat" --> "snackbox_1.dat"
                        for index in range(upper_bound):
                            # check if this new name already exists as well
                            next_best_name = file_name[0] + '_' + str(index) + '.' + file_name[1]

                            # this name is free, so use it and stop searching
                            if not os.path.isfile(os.path.join(dir, next_best_name)):
                                print('Renaming with next best name')
                                os.rename(file_in_dir_full_path, os.path.join(dir, next_best_name))
                                break
                            # otherwise this name is taken as well; try the next index

                    # file with ideal name does not already exist, rename with the ideal name (no _##)
                    else:
                        print('Renaming with ideal name')
                        os.rename(file_in_dir_full_path, os.path.join(dir, ideal_new_file_name))


def find_and_rename_include_sub_dirs(master_dir, list_of_files_names, upper_bound=1000):
    for path, subdirs, files in os.walk(master_dir):
        print(path)  # DEBUG
        find_and_rename(path, list_of_files_names, upper_bound)


find_and_rename_include_sub_dirs('C:/Users/Oxen/Documents/test_folder', ['snackbox.txt', 'zebrabar.txt', 'cornrows.txt', 'meatpack.txt', 'calc.txt'])
Vader