0

What's an efficient way to remove a list of special characters from a filename? I want to replace spaces with '.' and each of '(', ')', '[', ']' with '_'. I can do it for one character, but I'm not sure how to handle several characters at once.

import os
import sys
files = os.listdir(os.getcwd())

for f in files:
    os.rename(f, f.replace(' ', '.'))
iheartcpp
    have a look at: http://stackoverflow.com/questions/3411771/multiple-character-replace-with-python (and others) – Stidgeon Feb 06 '16 at 23:49
    http://stackoverflow.com/questions/16720541/python-string-replace-regular-expression will point you to `re.sub` which will let you use a regular expression. – SteveTurczyn Feb 06 '16 at 23:51
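The `re.sub` route mentioned in the comment above could look roughly like this. It is only a sketch, assuming Python 3; the function name `sanitize_with_regex` is made up for illustration and does not come from the linked questions.

```python
import re

# Hypothetical helper: one regex pass over the name.
# The character class matches a space or any of ( ) [ ].
_SPECIAL = re.compile(r"[ ()\[\]]")


def sanitize_with_regex(name):
    # A space becomes '.', every other matched character becomes '_'.
    return _SPECIAL.sub(lambda m: "." if m.group() == " " else "_", name)
```

The result would then be passed to `os.rename(old_name, sanitize_with_regex(old_name))` in the same loop as in the question.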

2 Answers

1

You could use a for loop that checks each character in the file name and replaces it if needed:

import os

files = os.listdir(os.getcwd())
under_score = ['(', ')', '[', ']']  # Anything to be replaced with '_' goes in this list.
dot = [' ']  # Anything to be replaced with '.' goes in this list.

for f in files:
    copy_f = f
    for char in f:
        if char in dot:
            copy_f = copy_f.replace(char, '.')
        if char in under_score:
            copy_f = copy_f.replace(char, '_')
    os.rename(f, copy_f)

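The trick is that the inner for loop runs once per character in the file name, so every character that matches the criteria is certain to be replaced :) (Each `replace` call already swaps every occurrence of that character, so repeated characters are handled more than once, which is harmless.) Also, there was no need for this import: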

import sys
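As an alternative to looping over the characters yourself, the built-in `str.translate` can apply the whole mapping in one call. This is a minimal sketch, assuming Python 3; the names `TABLE`, `sanitize` and `rename_all` are illustrative and not part of the answer above.

```python
import os

# Translation table: space -> '.', brackets and parentheses -> '_'.
TABLE = str.maketrans({" ": ".", "(": "_", ")": "_", "[": "_", "]": "_"})


def sanitize(name):
    # str.translate applies every mapping in a single pass over the string.
    return name.translate(TABLE)


def rename_all(directory="."):
    # Hypothetical helper: rename only the entries whose name changes.
    for name in os.listdir(directory):
        new_name = sanitize(name)
        if new_name != name:
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, new_name))
```

Adding another character to replace then only means adding one entry to the dictionary passed to `str.maketrans`.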
abe
0

This solution works, and if your request for efficiency is about avoiding O(n^2) time complexity, then this should be OK, since each file name is scanned only once.

import os

files = os.listdir(os.getcwd())
use_dots = set([' '])
use_underbar = set([')', '(', '[', ']'])

for file in files:
    tmp = []
    for char in file:
        if char in use_dots:
            tmp.append('.')
        elif char in use_underbar:
            tmp.append('_')
        else:
            tmp.append(char)
    new_file_name = ''.join(tmp)
    os.rename(file, new_file_name)

You could increase the efficiency of this by using a bytearray; this would avoid the 'tmp' list and the new string created by the subsequent join.
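A rough sketch of the bytearray idea, assuming Python 3, where file names are `str` and have to be encoded first. The name `sanitize_bytes` is made up for illustration, and the byte-level replacement assumes the filesystem encoding is ASCII-compatible (UTF-8, for example), so that these five characters are always single bytes.

```python
import os

# Byte values to replace (iterating over bytes yields ints).
DOT_BYTES = frozenset(b" ")
UNDERSCORE_BYTES = frozenset(b"()[]")


def sanitize_bytes(name):
    # Assumption: the filesystem encoding is ASCII-compatible.
    buf = bytearray(os.fsencode(name))
    for i, b in enumerate(buf):
        if b in DOT_BYTES:
            buf[i] = ord(".")  # overwrite in place, length is unchanged
        elif b in UNDERSCORE_BYTES:
            buf[i] = ord("_")
    # os.fsdecode expects bytes, so convert the bytearray back first.
    return os.fsdecode(bytes(buf))
```

Whether this is actually faster than the list-and-join version would need to be measured; the encode and decode steps add their own cost.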

abe
willnx