How to convert a byte array to string?

Question

I just finished writing a Huffman compression algorithm. I converted my compressed text from a string to a byte array with bytearray(). I'm now trying to decompress it, but I can't convert my byte array back into a string. Is there a built-in function I could use to convert my byte array (stored in a variable) back into a string? If not, is there a better way to convert my compressed string to something else? Here is my code, followed by the error I get from byte_array.decode():

import sys

print("Index: ", Index)  # The Index

# Substituting text to our compressed index
for x in range(len(TextTest)):
    TextTest[x] = Index[TextTest[x]]

NewText = ''.join(TextTest)
# print(NewText)
# NewText=int(NewText)

byte_array = bytearray()  # Converts the compressed string text to bytes
for i in range(0, len(NewText), 8):
    byte_array.append(int(NewText[i:i + 8], 2))

# Note: sys.getsizeof() includes object overhead; len(byte_array) is the payload size
NewSize = ("Compressed file Size:", sys.getsizeof(byte_array), 'bytes')

print(byte_array)
print(NewSize)

x = bytes(byte_array)
x.decode()

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 0: invalid start byte

You can convert it to a string by calling the [bytearray.decode()](https://docs.python.org/3/library/stdtypes.html#bytes.decode) method and supplying an encoding. For example: `byte_array.decode('ascii')`. If you leave the encoding argument out, it will default to `'utf-8'`. — martineau, Nov 21 '18 at 07:15
Hey, I got this when I added your code, `byte_array.decode('ascii')`: UnicodeDecodeError: 'ascii' codec can't decode byte 0x88 in position 0: ordinal not in range(128). When I removed the 'ascii' part I got: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 0: invalid start byte — Mohamed Alremeithi, Nov 23 '18 at 10:11
That means the data in your byte array doesn't contain valid characters in those encodings. You need to find an acceptable one. There are some listed [here](https://docs.python.org/3/library/codecs.html#binary-transforms) in the documentation; `'hex'` might be good. You can also use `'latin1'`, which maps the code points 0–255 to the bytes 0x0–0xff. Doing so will allow you to convert the result back to bytes later by using `the_string.encode('latin1')`. I first heard about doing this in [this answer](https://stackoverflow.com/a/22621777/355230) to an unrelated question (to solve a different problem). — martineau, Nov 23 '18 at 10:43
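
The round trip suggested in the last comment could look roughly like the sketch below. The sample bytes are made up for illustration (0x88 is the byte from the error message), and the hex variant uses the `hex()` and `fromhex()` methods of `bytearray` rather than the `'hex'` codec mentioned in the comment.

```python
# Hypothetical sample data; 0x88 is the byte from the error message above.
byte_array = bytearray([0x88, 0x41, 0xFF, 0x00])

# latin1 maps every byte value 0-255 to the code point with the same number,
# so decoding never fails and encoding restores the original bytes.
as_text = byte_array.decode('latin1')
restored = bytearray(as_text.encode('latin1'))

# Hex representation: two hexadecimal digits per byte.
as_hex = byte_array.hex()
restored_from_hex = bytearray.fromhex(as_hex)

print(repr(as_text))
print(as_hex)
print(restored == byte_array, restored_from_hex == byte_array)
```

Both strings are lossless representations of the bytes, not readable text: the latin1 string has one character per byte, while the hex string has two.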

Dorian Turba · Answer 1 · 2021-09-14T14:43:43.037


You can use .decode('ascii'), or call .decode() with no argument to use UTF-8. This only works when the bytes are valid text in the chosen encoding.

>>> print(bytearray("abcd", 'utf-8').decode())
abcd

Source: Convert bytes to a string?
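
The byte array in the question holds packed bits rather than encoded text, which is why `.decode()` fails on it. For decompression, one option is to rebuild the bit string directly from the bytes. The following is a minimal sketch, not the asker's code: the helper names `bits_to_bytes` and `bytes_to_bits` are hypothetical, and it assumes the original number of bits is stored alongside the compressed data, because the last chunk may be shorter than 8 bits.

```python
def bits_to_bytes(bits):
    """Pack a string of '0'/'1' characters into bytes, 8 bits per byte
    (the same loop as in the question)."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        out.append(int(bits[i:i + 8], 2))
    return out


def bytes_to_bits(data, bit_count):
    """Rebuild the bit string from packed bytes.

    bit_count is the length of the original bit string; it is assumed to be
    saved separately, since it cannot be recovered from the bytes alone.
    """
    full, rest = divmod(bit_count, 8)
    # Every full byte becomes exactly 8 bits, keeping leading zeros.
    chunks = [format(b, '08b') for b in data[:full]]
    if rest:
        # The final partial chunk was stored as a small integer; pad it
        # back to its original width.
        chunks.append(format(data[full], '0{}b'.format(rest)))
    return ''.join(chunks)


bits = "1000100001"            # hypothetical compressed bit string
packed = bits_to_bytes(bits)
print(packed)
print(bytes_to_bits(packed, len(bits)) == bits)
```

The recovered bit string can then be walked through the Huffman table in reverse to get the original text back.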

edited Sep 14 '21 at 14:43

answered Nov 21 '18 at 07:18
