I wrote a simple script to rename a bunch of files in a directory:

import os

crash_course_dir = ...
os.chdir(crash_course_dir)

for filename in os.listdir('.'):
    dot_idx = filename.index('.')
    new_file_name = filename[:dot_idx].strip() + ' . ' + filename[dot_idx + 1:].strip()
    print 'before:\t', filename, '\nafter:\t', new_file_name, '\n\n'
    os.rename(filename, new_file_name)

and it works as I expected except for one thing:

after:  7 . �2,000 Years of Chinese History! The Mandate of Heaven and Confucius - World History.mp4

This was the output in my console, but when I look inside the directory, all I see is 7 . ‎2,000 Years of Chinese History! The Mandate of Heaven and Confucius - World History.mp4

This is the only file (out of 42) that shows this weird char (as far as I can see).


I added this check:

if new_file_name[0] == '7':
    print new_file_name[4], ord(new_file_name[4])

Output: � 253

Why is this happening? Did someone mistakenly add this character, and because Windows doesn't display it, no one noticed? Could leaving it in cause problems? (I can remove it using this)
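If the stray character turns out to be an invisible Unicode format character (for example U+200E, LEFT-TO-RIGHT MARK), one way to drop it is to filter on the Unicode category. This is only a sketch: `strip_format_chars` is a hypothetical helper, and it assumes the name is already Unicode text (in Python 2.7, calling `os.listdir(u'.')` with a unicode path returns unicode names).

```python
import unicodedata


def strip_format_chars(name):
    """Return name without Unicode 'Cf' (format) characters such as U+200E.

    Hypothetical helper; assumes `name` is unicode text, not a byte string.
    """
    return u"".join(ch for ch in name if unicodedata.category(ch) != "Cf")


# Example with an invisible left-to-right mark before the "2":
cleaned = strip_format_chars(u"7 . \u200e2,000 Years of Chinese History!.mp4")
```

Category `Cf` also covers characters such as the zero-width joiner, which carry meaning in some scripts, so a blanket filter like this may be too aggressive for names in those scripts.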

I use Python 2.7 with Spyder on Windows 8.1.

CIsForCookies

1 Answer

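Try decoding the file name as UTF-8 before printing it. Note that a byte with value 253 (0xFD) is never valid in UTF-8, so if this raises UnicodeDecodeError, the name is in a different encoding (with a byte-string path on Windows, Python 2 returns names in the system's ANSI code page).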

print filename.decode('utf-8')
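To check which encoding makes sense for the byte reported in the question (253), it can help to decode that single byte under a few candidate encodings and look at the result. The sketch below is an illustration, not a diagnosis: `describe_byte` is a hypothetical helper, and the candidate list (`utf-8`, `cp1252`, `cp1255`) is an assumption, since the question does not say which ANSI code page the machine uses.

```python
import unicodedata


def describe_byte(byte_value, encodings=("utf-8", "cp1252", "cp1255")):
    """Map each candidate encoding to a description of the decoded byte.

    Hypothetical helper; the default encodings are guesses, not facts
    about the asker's system.
    """
    raw = bytes(bytearray([byte_value]))  # one-byte string on Python 2 and 3
    results = {}
    for enc in encodings:
        try:
            ch = raw.decode(enc)
        except UnicodeDecodeError:
            results[enc] = "not decodable"
        else:
            results[enc] = "U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>"))
    return results


report = describe_byte(253)
for enc in sorted(report):
    print("%s: %s" % (enc, report[enc]))
```

If one of the candidates yields an invisible character, that would be consistent with a name that looks normal in the directory listing but prints as a replacement glyph in the console.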
Ilayaraja