Python decode unknown character

Question

I'm trying to decode the following: UKLTD� For into utf-8 (or anything really), but I cannot work out how to do it and keep getting errors like:

'ascii' codec can't decode byte 0xae in position 8: ordinal not in range(128)

I'm reading from a csv and have the following:

with open(path_to_file, 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        order = Order(
            ...
            product_name = row[11].encode('utf-8'),
            ...
        )
        order.save()

I would be happy right now to just ignore the character, as long as I can keep the rest of the string.

`0xae` is also not valid on its own in UTF-8 (it can only appear as a continuation byte). It might be another character set (ISO-8859-1 perhaps?). Do you know what character it is supposed to be? In ISO-8859-1 it is ® (registered sign). — Bart Friederichs, Dec 22 '17 at 12:34
Use try/except, and in the except block use `product_name = row[11].encode('utf-16')` — Usman Maqbool, Dec 22 '17 at 12:39
maybe it can help you: https://stackoverflow.com/questions/21129020/how-to-fix-unicodedecodeerror-ascii-codec-cant-decode-byte — Adi Ep, Dec 22 '17 at 13:53
this sorted it: `product_name = row[11].decode('iso-8859-1').encode('utf8')` — HenryM, Dec 22 '17 at 13:57
@HenryM you could answer your own question. This way, other people might be helped as well. — Bart Friederichs, Dec 22 '17 at 13:59

score 0 · Accepted Answer · answered Dec 22 '17 at 17:35

Thank you to @BartFriederichs.

The solution is: `product_name = row[11].decode('iso-8859-1').encode('utf8')`
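
For anyone wondering why this works, here is a minimal sketch. The byte string `raw` is a made-up sample standing in for `row[11]`, not the actual contents of the file, and it assumes the data really is ISO-8859-1. The last line shows one way to simply drop the offending byte instead, which is what the question originally asked for.

```python
# Hypothetical sample standing in for row[11]; the real field may differ.
raw = b'UKLTD\xae For'

# ISO-8859-1 maps every possible byte to a code point, so this decode
# cannot fail; 0xae becomes U+00AE (registered sign).
text = raw.decode('iso-8859-1')

# Re-encode as UTF-8: U+00AE becomes the two bytes 0xc2 0xae.
utf8_bytes = text.encode('utf-8')

# Alternative: discard any byte that is not plain ASCII.
stripped = raw.decode('ascii', 'ignore')
```

Note that the `'ignore'` variant silently loses data, so decoding with the correct character set is usually the better choice when the source encoding is known.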

HenryM
