I was trying to get a list of the names of the .txt files written in Korean in a specified directory, using the code below:

dir_list = tf.gfile.Glob(engine.TXT_DIR+"/*.txt")

However, this gives me the following error:

Traceback (most recent call last):
  File "D:/Prj_mayDay/Prj_FrankenShtine/shakespear_reborn/main.py", line 108, in <module>
    dir_list = tf.gfile.Glob(engine.TXT_DIR+"/*.txt")
  File "D:\KimKanna's Class\python35\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 326, in get_matching_files
    compat.as_bytes(filename), status)
  File "D:\KimKanna's Class\python35\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 325, in <listcomp>
    for matching_filename in pywrap_tensorflow.GetMatchingFiles(
  File "D:\KimKanna's Class\python35\lib\site-packages\tensorflow\python\util\compat.py", line 106, in as_str_any
    return as_str(value)
  File "D:\KimKanna's Class\python35\lib\site-packages\tensorflow\python\util\compat.py", line 84, in as_text
    return bytes_or_text.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb in position 19: invalid start byte

After some research, I found out the reason:

The error occurs because there is a non-ASCII character in the directory that can't be encoded/decoded.
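To make that explanation concrete, here is a minimal sketch of how such a decode failure can arise. It assumes (this is not confirmed by the traceback) that the file name reached TensorFlow as bytes in a legacy Korean encoding such as CP949 rather than UTF-8; the sample name `한글.txt` and the helper `decodes_as_utf8` are made up for illustration.

```python
def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if the byte string is valid UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical file name, used only for this demonstration.
korean_name = "한글.txt"

# The same name encoded two ways: UTF-8, and CP949 (assumed legacy encoding).
utf8_bytes = korean_name.encode("utf-8")
cp949_bytes = korean_name.encode("cp949")

print(decodes_as_utf8(utf8_bytes))   # the UTF-8 bytes decode cleanly
print(decodes_as_utf8(cp949_bytes))  # the CP949 bytes are not valid UTF-8
print(decodes_as_utf8(b"\xbb"))      # 0xbb alone cannot start a UTF-8 sequence
```

If the assumption holds, any name that is pure ASCII decodes fine either way, which would explain why only some files trigger the error.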

However, I do not see any way to apply that solution to my code. Or is there one?

If there is alternative code for this, it should work for both a Cloud Storage bucket and my personal hard drive, as the code above did.

I'm using Python 3 and TensorFlow 1.2.0-rc2.

Kanna Kim

1 Answer

After a few hours of fiddling with my code, I finally found the solution. It turned out that one of the files in the directory I specified had a name in Korean. After I took that file out of the directory, the problem was gone.
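For anyone who needs to track down the offending file, here is a rough sketch of one way to do it on a local drive. It uses only the standard library (`os.listdir`), so it does not cover a cloud storage bucket, and the helper names `non_ascii_names` and `non_ascii_txt_files` are made up for this example rather than part of TensorFlow.

```python
import os


def non_ascii_names(names):
    """Return the names that contain at least one non-ASCII character."""
    return [name for name in names if any(ord(ch) > 127 for ch in name)]


def non_ascii_txt_files(directory):
    """List .txt files in a local directory whose names are not pure ASCII."""
    txt_names = [
        name for name in os.listdir(directory)
        if name.lower().endswith(".txt")
    ]
    return non_ascii_names(txt_names)
```

Calling `non_ascii_txt_files(engine.TXT_DIR)` should return the names that need to be renamed or moved before calling `tf.gfile.Glob` on that directory.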

Kanna Kim