Decoding HappyBase data from HBase

Question

While trying to decode values retrieved from HBase, I get an error. Python apparently thinks the data is not valid UTF-8, but the Java application that put the data into HBase encoded it only as UTF-8.

    a = '\x00\x00\x00\x00\x10j\x00\x00\x07\xe8\x02Y'
    a.decode("UTF-8")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
        return codecs.utf_8_decode(input, errors, True)
    UnicodeDecodeError: 'utf8' codec can't decode byte 0xe8 in position 9: invalid continuation byte

Any thoughts?

This is some kind of raw bytes representation. You need to know the original data type to decode it. I am looking for a solution myself. - Naomi Fridman, May 22 '18 at 21:13

score 0 · Answer 1 · answered Jul 01 '16 at 15:37

That data is not valid UTF-8, so if you really retrieved it as such from the database, you should check who or what put it in there.

wouter bolsterlee

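
As a rough illustration of the points made in the answer and the comment, the sketch below first confirms that the 12 bytes from the question are not valid UTF-8, then shows how they could be unpacked as fixed-width big-endian integers with Python's struct module. The layouts tried here (three 32-bit ints, or six 16-bit shorts) are pure assumptions for illustration. The question does not say how the Java application serialized the value, so the real layout has to come from the code that wrote it. The sketch uses Python 3 bytes literals rather than the Python 2.7 str from the question.

```python
import struct

# The raw cell value from the question, as a Python 3 bytes literal.
raw = b'\x00\x00\x00\x00\x10j\x00\x00\x07\xe8\x02Y'

# Step 1: check whether the value is valid UTF-8 at all.
try:
    raw.decode("utf-8")
    is_utf8 = True
except UnicodeDecodeError:
    is_utf8 = False

# Step 2: look at the bytes in hex to spot fixed-width fields.
hex_view = raw.hex()

# Step 3 (ASSUMPTION): interpret the value as big-endian integers.
# Java writes multi-byte numbers in big-endian order by default,
# but the field widths below are guesses, not the known schema.
as_ints = struct.unpack(">3i", raw)    # three signed 32-bit ints
as_shorts = struct.unpack(">6H", raw)  # six unsigned 16-bit ints

print(is_utf8)
print(hex_view)
print(as_ints)
print(as_shorts)
```

If one of these interpretations yields plausible values, that suggests the writer stored binary numbers rather than text. Confirming it still requires checking the Java code that performed the put.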
