How to (literally) read a file character by character?

Question

As the title says, I want to read a text file character by character.

I have a large file that is mostly text but also contains some invalid bytes that Python refuses to decode. I don't currently have time to figure out what is actually wrong with it, so I just want to skip all the problematic bytes using try/except.

with open(filein,'r',encoding='ascii') as file:
    while True:
        try:
            char = file.read(1)
        except UnicodeDecodeError:
            continue

        if not char:
            break

        print(char)

However, this doesn't work: it just skips over all the bytes and outputs nothing.

My guess is that every time I call read(), it reads the entire file before cropping it down to one character, so the whole read is treated as an error.

So I was wondering if there is a way to literally read a single char out of a file in Python, kinda like fgetc() in C?

Without the break, this code will run forever. Can you elaborate on what you consider an illegal byte? — Rafael W., Jun 16 '20 at 14:39
@Rafael They're using ASCII, so that means any byte with more than 7 bits is illegal. — wjandrea, Jun 16 '20 at 14:44
@poke Actually with a file in text mode, `read(1)` reads a single *character*, though that wasn't reflected [in the tutorial](https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects) until recently. I actually [submitted that correction myself](https://github.com/python/cpython/pull/13852). Either way, ASCII is a single-byte encoding, so it's a moot point. — wjandrea, Jun 16 '20 at 15:19
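
A minimal sketch of two ways to get fgetc()-like behaviour, assuming the goal is simply to drop every byte that is not 7-bit ASCII, as the comments describe. The function names `iter_ascii_chars` and `iter_chars_ignoring_errors` are made up for illustration, and the demo file contents are invented.

```python
import os
import tempfile


def iter_ascii_chars(path):
    """Yield each ASCII character in the file, skipping bytes >= 0x80."""
    # Binary mode: read(1) returns exactly one byte and performs no decoding,
    # so no UnicodeDecodeError can be raised by the read itself.
    with open(path, "rb") as f:
        while True:
            byte = f.read(1)
            if not byte:  # b"" signals end of file
                break
            if byte[0] < 0x80:  # 7-bit values are valid ASCII
                yield byte.decode("ascii")


def iter_chars_ignoring_errors(path):
    """Text-mode variant: the decoder silently drops undecodable bytes."""
    # newline="" disables newline translation so the output matches
    # the binary-mode version byte for byte.
    with open(path, "r", encoding="ascii", errors="ignore", newline="") as f:
        while True:
            char = f.read(1)
            if not char:  # "" signals end of file
                break
            yield char


# Demonstration on a throwaway file that contains two invalid bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"ab\xffc\x80d\n")
    demo_path = tmp.name

print("".join(iter_ascii_chars(demo_path)))
os.remove(demo_path)
```

Both variants avoid try/except around read(1). In text mode the decoding layer works on buffered chunks rather than single bytes, which is probably why catching the exception and continuing loses far more than the one bad byte; reading in binary mode, or passing errors="ignore", sidesteps that.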
