
I am trying to create a simple file indexing utility in Python 3.4. The goal is to enable quick searching for a filename using an index.

To do this I used the os.walk function, but I get a UnicodeEncodeError when printing the name of a directory. I looked at other related questions, but they don't seem to describe the same error (UnicodeEncodeError).

My code is:

import os

def index_files(path_to_index):
    indexed_files = []

    for dirname, dirnames, filenames in os.walk(path_to_index):
        # Print the path to every subdirectory first.
        for subdirname in dirnames:
            print(os.path.join(dirname, subdirname))

        # Print the path to every file and add it to the index.
        for filename in filenames:
            full_path = os.path.join(dirname, filename)
            print("Found " + full_path)
            indexed_files.append(full_path)

    return indexed_files

The output I'm getting is:

Traceback (most recent call last):
  File "[OMITTED LOCAL PATH]indexer.py", line 40, in <module>
    main()
  File "[OMITTED LOCAL PATH]indexer.py", line 37, in main
    indexed_files = index_files(path_to_index)
  File "[OMITTED LOCAL PATH]indexer.py", line 16, in index_files
    print(os.path.join(dirname, subdirname))
  File "C:\Python34\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 31-34: character maps to <undefined>

What is the proper way to do this?

ose
  • http://stackoverflow.com/questions/14630288/unicodeencodeerror-charmap-codec-cant-encode-character-maps-to-undefined – Padraic Cunningham Mar 14 '15 at 10:54
  • At issue here is your console: it is configured to only handle the [CP 850 codepage](https://en.wikipedia.org/wiki/Code_page_850) (MS-DOS western alphabet). You'll need to reconfigure your console. – Martijn Pieters Mar 14 '15 at 10:58
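
As the comment above says, the error comes from print encoding the path for a CP 850 console, not from os.walk itself. If reconfiguring the console is not an option, one possible workaround is to escape whatever the console encoding cannot represent before printing. This is only a sketch: the helper name to_console_safe is hypothetical and not part of the original code, and it assumes that escaped output is acceptable for display.

```python
import sys

def to_console_safe(text, encoding=None):
    """Return text with characters the target encoding cannot represent
    replaced by backslash escapes (hypothetical helper, for display only)."""
    # Fall back to ASCII if stdout reports no encoding (e.g. when redirected).
    encoding = encoding or sys.stdout.encoding or "ascii"
    # Encode with a forgiving error handler, then decode back to str so that
    # print() only ever sees characters the console can show.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)

# Characters outside CP 850 are escaped instead of raising UnicodeEncodeError.
print(to_console_safe("dir/\u4e2d\u6587", "cp850"))
```

In the loop from the question this would be used as print(to_console_safe(os.path.join(dirname, subdirname))). The index itself can keep storing the unescaped full_path, since only the console output needs the escaping.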
