
I want to clean up a file name, but ONLY by replacing the special characters that are not allowed in file names:

char_not_supported_by_file_name = ['\', '/', ':', '*', '?', '"', '<', '>', '|']        
tmp_file_name= file

for c in char_not_supported_by_file_name:    
    if c in tmp_file_name:    
        tmp_file_name = tmp_file_name.replace(c, '_')

I am trying to write this list, check whether the file name I want to clean up contains any of the 9 special characters I don't want, and replace each one with an underscore, but my IDE says the list is written incorrectly. How can I write it correctly?

Andrej Kesely
Because `'\'` is not a valid string; it needs to be `'\\'`. You could also use [`re.sub`](https://docs.python.org/3/library/re.html#re.sub) instead and replace all occurrences of all the characters in one call. – Tomerikoo Jul 25 '19 at 21:25

3 Answers


If you precede a quote with a backslash, the quote is escaped: it becomes a character in the string instead of marking the end of the string. To get a literal backslash, you must escape it with another backslash:

char_not_supported_by_file_name = ['\\', '/', ':', '*', '?', '"', '<', '>', '|']

Also, replace will do nothing if it can't find any instances of the character that needs to be replaced, so you can omit the if check:

for c in char_not_supported_by_file_name:
    tmp_file_name = tmp_file_name.replace(c, '_')
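
Putting the two points together, here is a minimal sketch wrapped in a function. The function name `clean_file_name` and the sample input are illustrative assumptions, not part of the question.

```python
# Characters the question lists as not allowed in file names.
# Note that '\\' is a single backslash character written with an escape.
CHARS_NOT_SUPPORTED = ['\\', '/', ':', '*', '?', '"', '<', '>', '|']


def clean_file_name(name):
    """Replace each unsupported character in name with an underscore."""
    for c in CHARS_NOT_SUPPORTED:
        # str.replace returns the string unchanged when c is absent,
        # so no membership check is needed first.
        name = name.replace(c, '_')
    return name


print(len('\\'))                              # 1
print(clean_file_name('report: draft?.txt'))  # report_ draft_.txt
```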
iz_

If you're comfortable with regex, you can make your code more concise by using a regular expression instead of a list:

import re

tmp_file_name = file
tmp_file_name = re.sub(r'[\\/:*?\"<>|]', '_', tmp_file_name)

This also sidesteps your original problem: the backslash in the first element of your list, '\', escapes the closing quote, turning it into a literal ' instead of ending the string around your backslash.
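
If you would rather keep the list and still make a single `re.sub`-style call, one possible sketch is to build the character class from the list with `re.escape`. The names `chars`, `pattern` and `clean_with_regex` are hypothetical, chosen only for this example.

```python
import re

# Same character list as in the question, with the backslash escaped.
chars = ['\\', '/', ':', '*', '?', '"', '<', '>', '|']

# re.escape makes every character safe to place inside the [...] class.
pattern = re.compile('[' + re.escape(''.join(chars)) + ']')


def clean_with_regex(name):
    """Replace every character matched by pattern with an underscore."""
    return pattern.sub('_', name)


print(clean_with_regex('a<b>c|d.txt'))  # a_b_c_d.txt
```

Compiling the pattern once is a small convenience if many file names are cleaned in a loop.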

lambdawaff

If you are willing to import modules, this could be done without the loop, using re.sub:

import re
file_name = "this/is:a*very?bad\\example>of<a|filename"

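res = re.sub("[\\\\/:*?\"<>|]", "_", file_name)
print(res)
# this_is_a_very_bad_example_of_a_filename

Note that in a regular (non-raw) string literal, matching a literal backslash takes four backslashes in the pattern. The reason is that those backslashes are escaped twice: once by the interpreter and then again by re. Three backslashes followed by / happen to work as well, but only because Python leaves the unrecognized escape \/ untouched, and newer Python versions emit a warning for it. A raw string (r'...') removes the first layer of escaping, so two backslashes are enough there. Read this question and its duplicates for more information.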
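
As a quick check of the double-escaping point, the following sketch compares a regular string literal with a raw one. The variable names are illustrative only.

```python
import re

regular = "[\\\\]"   # regular literal: four backslashes in the source
raw = r"[\\]"        # raw literal: two backslashes in the source

# Both literals produce the same pattern text: [\\]
print(regular == raw)  # True

# The sample text contains exactly one backslash character.
text = "a\\b"
print(len(text))                   # 3
print(re.sub(regular, "_", text))  # a_b
```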

Tomerikoo