
How can I convert a binary string to characters, and characters back to binary, in ASCII (one byte per character)?

For example, if I have: 1010111001100111110010101001011111000001101111011000011

I should get ®Ï*¾°, but if I convert these characters back to binary I get:

1010111011001111101010101111101101110110000

This binary string is not the same as the original. For example, the char * comes from 00101010 when I convert binary to characters, but when I convert the char * back to binary I get 101010, without the leading zeros.

Here is my code:

def bin_to_char(self,text_bin):
    char=''
    stock=''

    for bit in text_bin:
        if len(stock)<8:
            stock+=bit
        elif len(stock)==8:
            print(stock)
            char+=chr(int(stock, 2))
            print(char)
            stock=''

    char+=chr(int(stock, 2)) #add the last binary text less than 8

    return(char)

def char_to_bin(self,char):
    chbin=''

    for e in char:
        print(e)

        chbin+=format(ord(e), 'b')
        print(chbin)

    return(chbin)
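
A note on the mismatch described above: `format(ord(e), 'b')` emits the shortest binary form, so leading zeros are lost, and the loop in `bin_to_char` skips the bit it is reading whenever `stock` is already full. The following is a minimal sketch of a fixed-width round trip, not the code above. It assumes standalone functions (no `self`), code points below 256, and a bit string whose length is a multiple of 8.

```python
def char_to_bin(text):
    """Encode each character as exactly 8 bits (code points 0-255 only)."""
    for c in text:
        if ord(c) > 255:
            raise ValueError('character does not fit in one byte: ' + repr(c))
    # '08b' pads with leading zeros to a width of 8
    return ''.join(format(ord(c), '08b') for c in text)


def bin_to_char(bits):
    """Decode a bit string by slicing it into 8-bit chunks."""
    if len(bits) % 8:
        raise ValueError('bit string length must be a multiple of 8')
    return ''.join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


print(format(ord('*'), 'b'))    # 101010
print(format(ord('*'), '08b'))  # 00101010
print(bin_to_char(char_to_bin('®*')) == '®*')  # True
```

Slicing in steps of 8 avoids the bookkeeping of an accumulator, so no bit can be dropped between chunks.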
zabop
Citho
  • Does this answer your question? [Convert binary to ASCII and vice versa](https://stackoverflow.com/questions/7396849/convert-binary-to-ascii-and-vice-versa) – Riccardo Bucco Mar 23 '20 at 11:30
  • I have already tried that and got this error: `UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 2: invalid continuation byte` when I use this function: ```def text_from_bits(bits, encoding='utf-8', errors='surrogatepass'): n = int(bits, 2) return n.to_bytes((n.bit_length() + 7) // 8, 'big').decode(encoding, errors) or '\0' print(text_from_bits('1010111001100111110010101001011111000001101111011000011'))``` @RiccardoBucco – Citho Mar 23 '20 at 11:47
  • In the binary string, most octets begin with a '1'. This means they are not in the 7-bit ASCII range. What character encoding do these represent? Also, there are 55 binary digits, which is not evenly divisible by 8. – lit Mar 23 '20 at 15:15
  • @lit Yes, that's my problem. I got this binary string from the Huffman algorithm: I transformed a DNA sequence into a binary string using the Huffman tree. Each letter in my sequence is represented by the binary digits of its position in the tree. I have the Huffman code `[['C', '00'], ['N', '10'], ['T', '11'], ['A', '011'], ['G', '010']]` for the sequence NNTNACTTNGNNGTTNCCTATACCT. I now have to convert the binary string `1010111001100111110010101001011111000001101111011000011` into characters and back again. – Citho Mar 23 '20 at 16:04

1 Answer


The binary string does not represent octets of characters. It is a run of variable-length Huffman codes, so it must be parsed one code at a time. Here is some code that goes from the binary string to a text string, then converts the text back to binary to check the round trip.

The rs variable is the resulting string.

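s = '1010111001100111110010101001011111000001101111011000011'
t = s
# Huffman table as [letter, code] pairs; no code is a prefix of another
a = [['C', '00'], ['N', '10'], ['T', '11'], ['A', '011'], ['G','010']]

letter = 0  # index of the letter within a pair
code = 1    # index of the code within a pair
rs = ''

# Decode: repeatedly strip the one code that matches the front of t
while (t != ''):
    for pair in a:
        if t.startswith(pair[code]):
            t = t[len(pair[code]):]
            rs += pair[letter]
            break
    else:
        # No code matched, so stop instead of looping forever
        raise ValueError('Undecodable bits: ' + t)

print(rs)

# Convert rs back to binary string

rbs = ''
for c in rs:
    for pair in a:
        if c == pair[letter]:
            rbs += pair[code]
            break

print(rbs)
if (rbs != s): print('ERROR: Conversion failed')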

Let me know what kind of grade I get.
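
If the goal is also to store the 55-bit string as characters and get it back exactly, one possible approach is to pad it to a whole number of bytes and remember how many padding bits were added. This is only a sketch under assumptions: `pack_bits` and `unpack_bits` are hypothetical helper names, the padding count has to be stored alongside the data, and Latin-1 is chosen because it maps every byte value 0-255 to exactly one character.

```python
def pack_bits(bits):
    """Pack a '0'/'1' string into bytes.

    Returns (data, pad), where pad is the number of zero bits appended
    to fill the last byte.
    """
    pad = -len(bits) % 8
    padded = bits + '0' * pad
    data = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return data, pad


def unpack_bits(data, pad):
    """Rebuild the original bit string, dropping the pad bits at the end."""
    bits = ''.join(format(byte, '08b') for byte in data)
    return bits[:len(bits) - pad]


s = '1010111001100111110010101001011111000001101111011000011'
data, pad = pack_bits(s)
text = data.decode('latin-1')           # one character per byte
restored = unpack_bits(text.encode('latin-1'), pad)

print(len(data), pad)   # 7 1
print(restored == s)    # True
```

Decoding with UTF-8 instead can fail, because arbitrary byte values are not always valid UTF-8 sequences.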

lit