UnicodeDecodeError: 'utf-8' codec can't decode byte : but I don't know where in the code

Question

I am running a Python script in Spyder that doesn't print anything and doesn't handle any strings. It does, of course, contain comments, and some of them probably include accented characters.

When I try to run the code, I get:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 148: invalid continuation byte

My problem is that I don't know on which line it crashes: the traceback only lists lines in Anaconda's own Python files and doesn't tell me where the mistake is in my code.

What's more, the code used to work perfectly well. Then my PC crashed, I rebooted it, and now I get this error, even though nothing changed in my code in between.

All the help I found on this topic concerns strings containing unusual characters, but I insist that my code doesn't print anything and doesn't handle any strings (only matrices of numbers).

The code is quite long and I struggle to identify precisely where it fails. To start with, I would like to know whether someone has a general idea of the cause (the main point is that I don't work with any strings in my code; the only special characters would be in the comments).

At the very least, show us how you're calling anaconda. When a package you're using is generating errors, it indicates you're using it wrong. — Mark Ransom, Dec 15 '17 at 17:23
@MarkRansom in fact I am using spyder inside of anaconda. So I just press F5 to run the script (I don't know if that was your question, I'm quite a newbie with python) — StarBucK, Dec 15 '17 at 17:27
You say that it doesn't tell you where the mistake is, but if Python can't parse your code (that is, the problem is at compile time, not at runtime), then it _does_ tell you where the mistake is -- at position 148 of the file. — DSM, Dec 15 '17 at 17:30
It means the 148th byte of the file. Or actually the 149th, since the 1st is byte 0. — Mark Ransom, Dec 16 '17 at 04:19
@MarkRansom but is there a way to easily convert it to a line number, so I can identify where the error is? As it stands, it is not really helpful — StarBucK, Dec 16 '17 at 10:38

score 1 · Answer 1 · answered Dec 19 '17 at 06:40

It seems that the error might be in your source code itself. The error message is not terribly useful for pointing out exactly where the error is. This code fragment (in Python 3) can be used to print out the line that contains the error.

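error_position = 148  # byte offset reported in the error message
source_filename = 'your_script.py'  # placeholder: path to the file that fails
file_position = 0
# latin1 maps every byte to exactly one character, so len(line) counts bytes;
# newline='' keeps '\r\n' line endings intact so the offsets stay accurate
with open(source_filename, 'r', encoding='latin1', newline='') as f:
    for line in f:
        if file_position <= error_position < file_position + len(line):
            print(line.rstrip('\r\n'))
            spaces = ' ' * (error_position - file_position)
            print(spaces, '^', sep='')
            break
        file_position += len(line)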
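
The byte offset can also be read from the exception itself and converted to a line and column number, which is what the comments ask for. The following is a minimal sketch, assuming the whole file fits in memory; the helper name `locate_bad_byte` and the sample bytes are made up for illustration.

```python
def locate_bad_byte(data):
    """Return (line, column, byte value) of the first byte that is not
    valid UTF-8, or None if the data decodes cleanly.
    Line and column are 1-based; the column counts bytes, not characters."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        offset = exc.start  # 0-based offset of the offending byte
        line_number = data.count(b"\n", 0, offset) + 1
        line_start = data.rfind(b"\n", 0, offset) + 1
        return line_number, offset - line_start + 1, data[offset]
    return None


# Illustrative source bytes: a comment with a Latin-1 encoded "é" (0xe9).
sample = b"import numpy as np\n# matrice carr\xe9e\nx = 1\n"
print(locate_bad_byte(sample))
```

For a real file, the bytes could be read with `pathlib.Path("your_script.py").read_bytes()`, where the file name is a placeholder.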

You need to tell Python what encoding is used for the source file if it uses any characters not in the default. The default for Python 2 is ASCII and the default for Python 3 is UTF-8. For anything else you can add a magic comment to the first or second line of your source file, e.g.:

# -*- coding: windows-1252 -*-

For full details see PEP 263.
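
To check which encoding Python 3 would assume for a given source file, the standard library's `tokenize.detect_encoding` can be used; it looks for a byte order mark or a coding comment in the first two lines and otherwise falls back to UTF-8. The sketch below uses in-memory sample bytes, and the helper name `declared_encoding` is hypothetical.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python 3 would assume for these source bytes."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Illustrative sources: one with a coding comment, one without.
with_cookie = b"# -*- coding: windows-1252 -*-\n# caf\xe9\nx = 1\n"
without_cookie = b"# plain comment\nx = 1\n"

print(declared_encoding(with_cookie))     # encoding named in the comment
print(declared_encoding(without_cookie))  # the Python 3 default
```

An alternative to declaring the encoding is to re-save the file as UTF-8 in the editor, after which no coding comment is needed in Python 3.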

As for why your program didn't error until you rebooted, I have a theory. If you're working in an interactive Python prompt and use import to load your program, any updates to the program won't be seen even if you save the file and use import again. That's because Python caches imports; see How to unimport a python module which is already imported?
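
The caching behaviour can be reproduced in a small, self-contained sketch. It writes a throwaway module to a temporary directory (the module name `demo_module_cached` is made up), edits it on disk, and shows that a second `import` keeps the old version until `importlib.reload` is called. Whether this is what actually happened in the asker's Spyder session remains a guess.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep the demo free of cached .pyc files

with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp) / "demo_module_cached.py"
    module_path.write_text("VALUE = 1\n", encoding="utf-8")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # make the new file visible to the importer

    import demo_module_cached
    first = demo_module_cached.VALUE

    # Edit the file on disk, as an editor would after saving.
    module_path.write_text("VALUE = 2222\n", encoding="utf-8")

    import demo_module_cached  # served from sys.modules, file is not re-read
    cached = demo_module_cached.VALUE

    importlib.reload(demo_module_cached)  # forces the source to be re-read
    reloaded = demo_module_cached.VALUE

    sys.path.remove(tmp)

print(first, cached, reloaded)
```

Under this theory, a file saved with a non-UTF-8 byte would keep working in a long-running session and only fail once a fresh interpreter had to read it from disk again.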
