I have a PHP script that lists the files in a directory. However, PHP only sees file names in English and completely ignores file names in other languages, such as Russian or Asian languages.
After a lot of effort, the only solution I found that works for me is a Python script that renames the files to UTF-8, so that the PHP script can process them afterwards.
(After PHP has finished processing the files, I rename them to English; I don't keep them in UTF-8.)
I used the following Python script, which works fine:
import sys
import os
import glob
import ntpath
from random import randint
for infile in glob.glob( os.path.join('C:\\MyFiles', u'*') ):
    if os.path.isfile(infile):
        infile_utf8 = infile.encode('utf8')
        os.rename(infile, infile_utf8)
The problem is that it also converts file names that are already in UTF-8. I need a way to skip the conversion when the file name is already in UTF-8.
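To make the problem concrete, here is a small sketch of what converting a name twice appears to do. It assumes that the UTF-8 bytes passed to `os.rename` end up being interpreted in the Windows ANSI code page. The helper name `convert_once` and the `cp1252` code page are stand-ins chosen for illustration, not something taken from the script above, and the sketch is written for Python 3.

```python
def convert_once(name, codepage="cp1252"):
    """Simulate one rename: UTF-8 bytes read back through `codepage` (assumed)."""
    return name.encode("utf-8").decode(codepage)


original = u"chào.doc"
once = convert_once(original)   # what the first run would produce
twice = convert_once(once)      # what a second run would produce

# Each run makes the non-ASCII part of the name longer and less readable.
print(repr(once), len(once))
print(repr(twice), len(twice))
```

Under these assumptions the name keeps growing on every run, which is why names that were already converted need to be skipped.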
I tried this Python script:
for infile in glob.glob( os.path.join('C:\\MyFiles', u'*') ):
    if os.path.isfile(infile):
        try:
            infile.decode('UTF-8', 'strict')
        except UnicodeDecodeError:
            infile_utf8 = infile.encode('utf8')
            os.rename(infile, infile_utf8)
But if the file name is already in UTF-8, I get a fatal error:
UnicodeDecodeError: 'ascii' codec can't decode characters in position 18-20
ordinal not in range(128)
I also tried another way, which also didn't work:
for infile in glob.glob( os.path.join('C:\\MyFiles', u'*') ):
    if os.path.isfile(infile):
        try:
            tmpstr = str(infile)
        except UnicodeDecodeError:
            infile_utf8 = infile.encode('utf8')
            os.rename(infile, infile_utf8)
I got exactly the same error as before.
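A likely explanation, assuming these scripts run on Python 2: because the glob pattern is `u'*'`, the names come back as unicode objects rather than byte strings. Both `infile.decode(...)` and `str(infile)` then first encode the name with the ASCII codec, which fails for any non-ASCII character. The sketch below reproduces that step in Python 3 syntax; `try_ascii` is a hypothetical helper name used only for this illustration.

```python
def try_ascii(name):
    """Return None if `name` is pure ASCII, else the name of the exception raised."""
    try:
        name.encode("ascii")          # the implicit step Python 2 performs
    except UnicodeError as exc:       # base class of both encode and decode errors
        return type(exc).__name__
    return None


for name in [u"hello.txt", u"你好.txt"]:
    print(repr(name), try_ascii(name))
```

The exception raised here is a `UnicodeEncodeError`, which is not a subclass of `UnicodeDecodeError`. If the same happens in the scripts above, the `except UnicodeDecodeError` clause would never catch it.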
Any ideas?
Python is very new to me, and debugging even a simple script takes me a lot of effort, so please give an explicit answer (i.e. code). I am not able to test general ideas that may or may not work. Thanks.
Examples of file names:
hello.txt
你好.txt
안녕하세요.html
chào.doc
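For reference, a minimal sketch of one way the "already converted" check could work on these example names. It rests on an assumption: that a converted name is really the UTF-8 bytes displayed through the Windows ANSI code page, so mapping the name back through that code page and strictly decoding it as UTF-8 should succeed only for converted names. The function name `is_already_converted` and the default `cp1252` are hypothetical. On Windows, Python exposes the active ANSI code page as the `mbcs` codec, which may be the better choice there. The sketch is written for Python 3 and should also work on Python 2 when given unicode objects.

```python
def is_already_converted(name, codepage="cp1252"):
    """Guess whether `name` is UTF-8 bytes that were read through `codepage`."""
    try:
        raw = name.encode(codepage)   # undo the assumed code-page interpretation
    except UnicodeEncodeError:
        return False                  # has characters the code page lacks
    if all(b < 0x80 for b in bytearray(raw)):
        return False                  # pure ASCII: conversion would not change it
    try:
        raw.decode("utf-8")           # strict by default
    except UnicodeDecodeError:
        return False                  # not valid UTF-8, so not converted yet
    return True


names = [u"hello.txt", u"你好.txt", u"안녕하세요.html", u"chào.doc"]
for name in names:
    converted = name.encode("utf-8").decode("cp1252")  # simulated rename result
    print(repr(name), is_already_converted(name), is_already_converted(converted))
```

In the rename loop, this would translate to calling `os.rename` only when `is_already_converted(os.path.basename(infile))` is false. It is a heuristic: a name in the local code page that happens to form valid UTF-8 bytes would be misjudged as converted.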