Given the following files:

E:/Media/Foo/info.nfo
E:/Media/Bar/FXG™.nfo

I can "find" them with the following:

import fnmatch
import os

BASE = r'E:/Media/'

# Walk the tree and print the full path of every *.nfo file
for dirpath, _, files in os.walk(BASE):
    for f in fnmatch.filter(files, '*.nfo'):
        nfopath = os.path.join(dirpath, f)
        print(nfopath)

This snippet would then print the above paths.

However, if I make sure that each path created by os.path.join() is indeed a regular file -- for example with something like:

for dirpath, _, files in os.walk(BASE):
    for f in fnmatch.filter(files, '*.nfo'):
        nfopath = os.path.join(dirpath, f)
        print(nfopath)
        assert os.path.isfile(nfopath)   # <------

The assertion fails for the second filename, but not for the first.

I checked the folder in Explorer: the script did find a regular file and printed its name and path correctly, so I'm not clear on why the assertion fails.

I've tried specifying the BASE string as a unicode string (ur'E:/Media/') as well as explicitly encoding nfopath inside the isfile() call (assert os.path.isfile(nfopath.encode('utf-8'))).

Neither seemed to work.

Of course, I could keep track of the failing files and delete them manually, but I'm interested in how one would handle this correctly.

Thanks in advance.

(Python 2.7, Windows 7)

jedwards

1 Answer

According to this SO question, Windows stores file names as UTF-16 on NTFS volumes. Retry your encoding step with UTF-16 instead of UTF-8.
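
A related thing to check in Python 2.7, offered as a rough sketch and not a confirmed fix: keep every path as a unicode object from start to finish, so that os.walk() returns unicode names and os.path.isfile() receives them without any intermediate byte encoding. The helper name find_nfo_files is made up for illustration, and the E:/Media/ base is taken from the question. The question reports that a unicode BASE alone did not help, so treat this as a way to narrow the problem down.

```python
import fnmatch
import os


def find_nfo_files(base):
    """Yield (path, is_regular_file) for every *.nfo file under base.

    Hypothetical helper. Pass `base` as a unicode string so that
    os.walk() yields unicode directory and file names.
    """
    for dirpath, _, files in os.walk(base):
        for name in fnmatch.filter(files, u'*.nfo'):
            path = os.path.join(dirpath, name)
            # No .encode() here: the path stays unicode all the way
            yield path, os.path.isfile(path)


if __name__ == '__main__':
    BASE = u'E:/Media/'  # unicode literal, base path from the question
    for path, ok in find_nfo_files(BASE):
        # repr() shows the exact characters, independent of the console
        print('%r %s' % (path, ok))
```

Printing repr(path) instead of the path itself also shows whether the name Python holds really contains the trademark character (u'\u2122') or a substitute such as '?'.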

skrrgwasme