Get standard encodings out of Python

Question

I have a bytes object whose encoding I don't know; I only know that it is not UTF-8:

a.decode('utf-8')

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9a in position 0: invalid start byte

What I would like to do is something like:

for encoding in encodings:
    try:
        a.decode(encoding)
        print("This is it!", encoding)
    except Exception:
        pass

How can I get Python to give me a list of every encoding name that .decode accepts, so that I can use it as encodings in the loop above?

https://github.com/tripleee/8bit has code which does this. I recall seeing this kind of question several times before but cannot immediately find a good duplicate. — tripleee, Feb 13 '19 at 10:06
BTW, you do not exit the for loop when you find the encoding. But no: many encodings can successfully decode the same byte array. Most single-byte encodings have no structure, so every one of them could decode your string. You need to be smarter, and possibly avoid reinventing the wheel. – Giacomo Catenazzi, Feb 13 '19 at 15:06
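To illustrate the point about unstructured single-byte encodings, here is a minimal sketch. The sample byte string is hypothetical (only the leading 0x9a comes from the error message in the question), and the three codecs were picked purely as examples.

```python
# Hypothetical sample: a single byte, 0x9a, as in the question's error message.
data = b"\x9a"

# Three single-byte codecs that all accept this byte without raising.
decoded = {name: data.decode(name) for name in ("cp1252", "cp437", "latin_1")}

for name, text in decoded.items():
    # repr() makes the non-printable result from latin_1 visible.
    print(name, repr(text))
```

All three calls succeed, yet each returns a different character, so a successful decode on its own does not identify the encoding.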

score 2 · Accepted Answer · answered Feb 13 '19 at 10:07

You can get them like this:

import encodings.aliases

# The keys of this dict are alias names; the values are the codecs they map to.
all_of_encodings = encodings.aliases.aliases.keys()

for encoding in all_of_encodings:
    pass  # do what you want
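Plugged into the loop from the question, this could look roughly like the sketch below. It is only one way to do it: the function name candidate_encodings and the sample bytes are made up for illustration, and it iterates over the values of the alias table (the codec names) rather than the keys, so each codec is tried once.

```python
import encodings.aliases


def candidate_encodings(data):
    """Map each codec name that decodes `data` without error to the decoded text."""
    results = {}
    # The values of the alias table are codec names; a set removes duplicates.
    for name in sorted(set(encodings.aliases.aliases.values())):
        try:
            results[name] = data.decode(name)
        except (UnicodeError, LookupError):
            # UnicodeError: the bytes are not valid in this codec.
            # LookupError: the codec is not a text encoding or is unavailable here.
            continue
    return results


# Hypothetical sample input; only the 0x9a byte comes from the question.
candidates = candidate_encodings(b"\x9a")
print(len(candidates), "codecs decoded the sample without error")
```

Expect many hits for short inputs, so the result is a list of candidates to inspect, not an answer.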

Mehrdad Pedramfar

As outlined in the proposed duplicate, this is actually not a good solution. – tripleee Feb 13 '19 at 10:08
This does not cover the entire [list of all aliases and standard encodings](https://docs.python.org/3.7/library/codecs.html#standard-encodings). I ran the code and it didn't match. – ingyhere Jan 28 '20 at 05:56
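One possible way to get closer to the full list is to enumerate the modules in the encodings package itself instead of relying on the alias table. This is a sketch, not an official API for listing codecs: the helper name available_codec_names is made up, and the result also includes codecs that are not text encodings (base64, for example), which bytes.decode rejects.

```python
import codecs
import encodings
import pkgutil


def available_codec_names():
    """Return sorted canonical names of codecs found in the encodings package."""
    names = set()
    for module in pkgutil.iter_modules(encodings.__path__):
        try:
            # Helper modules and platform-specific codecs fail the lookup.
            info = codecs.lookup(module.name)
        except LookupError:
            continue
        names.add(info.name)
    return sorted(names)


codec_names = available_codec_names()
print(len(codec_names), "codecs found")
```

The names returned are the ones the codecs report for themselves (for example "utf-8" rather than "utf_8"); both spellings are accepted by .decode.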
