96

Using this code to take a string and convert it to binary:

bin(reduce(lambda x, y: 256*x+y, (ord(c) for c in 'hello'), 0))

this outputs:

0b110100001100101011011000110110001101111

Which, if I put it into this site (on the right hand site) I get my message of hello back. I'm wondering what method it uses. I know I could splice apart the string of binary into 8's and then match it to the corresponding value to bin(ord(character)) or some other way. Really looking for something simpler.

serv-inc
  • 35,772
  • 9
  • 166
  • 188
sbrichards
  • 2,169
  • 2
  • 19
  • 32
  • 1
    So is your question, "is there a more succinct way to do the inverse of my code than the obvious"? – tripleee Sep 13 '11 at 05:10
  • 1
    related: [`b2a_bin` extension in Cython](http://stackoverflow.com/a/12162190/4279) allows to create binary strings (`"01"`) directly from bytestrings without creating an intermediate Python integer. – jfs Nov 16 '13 at 05:59

8 Answers8

174

For ASCII characters in the range [ -~] on Python 2:

>>> import binascii
>>> bin(int(binascii.hexlify('hello'), 16))
'0b110100001100101011011000110110001101111'

In reverse:

>>> n = int('0b110100001100101011011000110110001101111', 2)
>>> binascii.unhexlify('%x' % n)
'hello'

In Python 3.2+:

>>> bin(int.from_bytes('hello'.encode(), 'big'))
'0b110100001100101011011000110110001101111'

In reverse:

>>> n = int('0b110100001100101011011000110110001101111', 2)
>>> n.to_bytes((n.bit_length() + 7) // 8, 'big').decode()
'hello'

To support all Unicode characters in Python 3:

def text_to_bits(text, encoding='utf-8', errors='surrogatepass'):
    bits = bin(int.from_bytes(text.encode(encoding, errors), 'big'))[2:]
    return bits.zfill(8 * ((len(bits) + 7) // 8))

def text_from_bits(bits, encoding='utf-8', errors='surrogatepass'):
    n = int(bits, 2)
    return n.to_bytes((n.bit_length() + 7) // 8, 'big').decode(encoding, errors) or '\0'

Here's single-source Python 2/3 compatible version:

import binascii

def text_to_bits(text, encoding='utf-8', errors='surrogatepass'):
    bits = bin(int(binascii.hexlify(text.encode(encoding, errors)), 16))[2:]
    return bits.zfill(8 * ((len(bits) + 7) // 8))

def text_from_bits(bits, encoding='utf-8', errors='surrogatepass'):
    n = int(bits, 2)
    return int2bytes(n).decode(encoding, errors)

def int2bytes(i):
    hex_string = '%x' % i
    n = len(hex_string)
    return binascii.unhexlify(hex_string.zfill(n + (n & 1)))

Example

>>> text_to_bits('hello')
'0110100001100101011011000110110001101111'
>>> text_from_bits('110100001100101011011000110110001101111') == u'hello'
True
jfs
  • 399,953
  • 195
  • 994
  • 1,670
  • 3
    @J.F.Sebastian i tried this method with the python current version and it seems that it does not work.
    TypeError: 'str' does not support the buffer interface
    Would you update your answer
    – hamza Nov 13 '12 at 19:30
  • 3
    @hamza: It works on Python 2. On Python 3 you should convert str to bytes first e.g., `your_string.encode('ascii', 'strict')` – jfs Nov 13 '12 at 19:32
  • 1
    @J.F.Sebasitian: thanks, however when i tried it vice versa the unhexlify funtion return an error message: binascii.Error: Odd-length string. – hamza Nov 14 '12 at 12:28
  • 3
    @hamza: prepend it with `'0'` if hex-string's length is not even. It happens if the first character in the original string has ascii code less than 16 e.g., `'\n'` or `'\t'`. Odd-length never happens for ascii letters `[ -~]`. – jfs Nov 14 '12 at 15:59
33

Built-in only python

Here is a pure python method for simple strings, left here for posterity.

def string2bits(s=''):
    return [bin(ord(x))[2:].zfill(8) for x in s]

def bits2string(b=None):
    return ''.join([chr(int(x, 2)) for x in b])

s = 'Hello, World!'
b = string2bits(s)
s2 = bits2string(b)

print 'String:'
print s

print '\nList of Bits:'
for x in b:
    print x

print '\nString:'
print s2

String:
Hello, World!

List of Bits:
01001000
01100101
01101100
01101100
01101111
00101100
00100000
01010111
01101111
01110010
01101100
01100100
00100001

String:
Hello, World!
tmthydvnprt
  • 10,398
  • 8
  • 52
  • 72
10

I'm not sure how you think you can do it other than character-by-character -- it's inherently a character-by-character operation. There is certainly code out there to do this for you, but there is no "simpler" way than doing it character-by-character.

First, you need to strip the 0b prefix, and left-zero-pad the string so it's length is divisible by 8, to make dividing the bitstring up into characters easy:

bitstring = bitstring[2:]
bitstring = -len(bitstring) % 8 * '0' + bitstring

Then you divide the string up into blocks of eight binary digits, convert them to ASCII characters, and join them back into a string:

string_blocks = (bitstring[i:i+8] for i in range(0, len(bitstring), 8))
string = ''.join(chr(int(char, 2)) for char in string_blocks)

If you actually want to treat it as a number, you still have to account for the fact that the leftmost character will be at most seven digits long if you want to go left-to-right instead of right-to-left.

agf
  • 171,228
  • 44
  • 289
  • 238
2

This is my way to solve your task:

str = "0b110100001100101011011000110110001101111"
str = "0" + str[2:]
message = ""
while str != "":
    i = chr(int(str[:8], 2))
    message = message + i
    str = str[8:]
print message
2

if you don'y want to import any files you can use this:

with open("Test1.txt", "r") as File1:
St = (' '.join(format(ord(x), 'b') for x in File1.read()))
StrList = St.split(" ")

to convert a text file to binary.

and you can use this to convert it back to string:

StrOrgList = StrOrgMsg.split(" ")


for StrValue in StrOrgList:
    if(StrValue != ""):
        StrMsg += chr(int(str(StrValue),2))
print(StrMsg)

hope that is helpful, i've used this with some custom encryption to send over TCP.

Kyle Burns
  • 157
  • 2
  • 15
1

Are you looking for the code to do it or understanding the algorithm?

Does this do what you need? Specifically a2b_uu and b2a_uu? There are LOTS of other options in there in case those aren't what you want.

(NOTE: Not a Python guy but this seemed like an obvious answer)

Jaxidian
  • 13,081
  • 8
  • 83
  • 125
  • I've been researching it for a bit, binascii isn't working for me, and mostly looking for the code, if I can see it I can understand it. Thanks though EDIT: when converting ascii to binary using binascii a2b_uu for "h" is \x00\x00\x00\x00\x00\x00\x00\x00 which is not what I need, I need 'hello' and actual 1's and 0's not shellcode looking ascii, also it only works char by char – sbrichards Sep 13 '11 at 04:43
  • @Jaxidian that was quite helpful for my purposes. Someone stored some data in a string and I have it. I am quite sure it's a 64binary b/c of the padding. I can successfully use `b2a_base64` on that, however the result is, indeed, confusing at best. How do I get a list of boolleans/integers (0,1) from there? – Ufos Feb 15 '18 at 17:15
0

Convert binary to its equivalent character.

k=7
dec=0
new=[]
item=[x for x in input("Enter 8bit binary number with , seprator").split(",")]
for i in item:
    for j in i:
        if(j=="1"):
            dec=2**k+dec
            k=k-1
        else:
            k=k-1
    new.append(dec)
    dec=0
    k=7
print(new)
for i in new:
    print(chr(i),end="")
-1

This is a spruced up version of J.F. Sebastian's. Thanks for the snippets though J.F. Sebastian.

import binascii, sys
def goodbye():
    sys.exit("\n"+"*"*43+"\n\nGood Bye! Come use again!\n\n"+"*"*43+"")
while __name__=='__main__':
    print "[A]scii to Binary, [B]inary to Ascii, or [E]xit:"
    var1=raw_input('>>> ')
    if var1=='a':
        string=raw_input('String to convert:\n>>> ')
        convert=bin(int(binascii.hexlify(string), 16))
        i=2
        truebin=[]
        while i!=len(convert):
            truebin.append(convert[i])
            i=i+1
        convert=''.join(truebin)
        print '\n'+'*'*84+'\n\n'+convert+'\n\n'+'*'*84+'\n'
    if var1=='b':
        binary=raw_input('Binary to convert:\n>>> ')
        n = int(binary, 2)
        done=binascii.unhexlify('%x' % n)
        print '\n'+'*'*84+'\n\n'+done+'\n\n'+'*'*84+'\n'
    if var1=='e':
        aus=raw_input('Are you sure? (y/n)\n>>> ')
        if aus=='y':
            goodbye()
TH33L1T3
  • 7
  • 1