25

I need to convert an ASCII string into a list of bits and vice versa:

str = "Hi" -> [0,1,0,0,1,0,0,0,0,1,1,0,1,0,0,1]

[0,1,0,0,1,0,0,0,0,1,1,0,1,0,0,1] -> "Hi"
Yu Hao
Dan

11 Answers

39

There are many ways to do this with library functions. But I am partial to the third-party bitarray module.

>>> import bitarray
>>> ba = bitarray.bitarray()

Conversion from strings requires a bit of ceremony. Once upon a time, you could just use fromstring, but that method is now deprecated, since it has to implicitly encode the string into bytes. To avoid the inevitable encoding errors, it's better to pass a bytes object to frombytes. When starting from a string, that means you have to specify an encoding explicitly -- which is good practice anyway.

>>> ba.frombytes('Hi'.encode('utf-8'))
>>> ba
bitarray('0100100001101001')

Conversion to a list is easy. (Also, bitarray objects have a lot of list-like functions already.)

>>> l = ba.tolist()
>>> l
[False, True, False, False, True, False, False, False, 
 False, True, True, False, True, False, False, True]

A bitarray can be created from any iterable:

>>> bitarray.bitarray(l)
bitarray('0100100001101001')

Conversion back to bytes or strings is relatively easy too:

>>> bitarray.bitarray(l).tobytes().decode('utf-8')
'Hi'

And for the sake of sheer entertainment:

>>> def s_to_bitlist(s):
...     ords = (ord(c) for c in s)
...     shifts = (7, 6, 5, 4, 3, 2, 1, 0)
...     return [(o >> shift) & 1 for o in ords for shift in shifts]
... 
>>> def bitlist_to_chars(bl):
...     bi = iter(bl)
...     byte_groups = zip(*(bi,) * 8)  # avoid shadowing the built-in bytes
...     shifts = (7, 6, 5, 4, 3, 2, 1, 0)
...     for byte in byte_groups:
...         yield chr(sum(bit << s for bit, s in zip(byte, shifts)))
... 
>>> def bitlist_to_s(bl):
...     return ''.join(bitlist_to_chars(bl))
... 
>>> s_to_bitlist('Hi')
[0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
>>> bitlist_to_s(s_to_bitlist('Hi'))
'Hi'
senderle
  • Thanks for the great reply, unfortunately my environment doesn't seem to have this module available, so John's answer fits my use case better. – Dan Apr 19 '12 at 23:17
31

There are probably faster ways to do this, but using no extra modules:

def tobits(s):
    result = []
    for c in s:
        bits = bin(ord(c))[2:]
        bits = '00000000'[len(bits):] + bits
        result.extend([int(b) for b in bits])
    return result

def frombits(bits):
    chars = []
    for b in range(len(bits) // 8):  # integer division, so this also runs on Python 3
        byte = bits[b*8:(b+1)*8]
        chars.append(chr(int(''.join([str(bit) for bit in byte]), 2)))
    return ''.join(chars)
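
A more compact variant of the same idea, as a sketch: format(ord(c), '08b') produces the zero-padded 8-bit string directly. The names tobits_compact and frombits_compact are made up for this illustration, and the code assumes every character has a code point below 256.

```python
def tobits_compact(s):
    # format(n, '08b') gives the binary digits of n, zero-padded to 8 places
    return [int(b) for c in s for b in format(ord(c), '08b')]


def frombits_compact(bits):
    # Walk the list in 8-bit chunks and rebuild one character per chunk
    chunks = (bits[i:i + 8] for i in range(0, len(bits), 8))
    return ''.join(chr(int(''.join(map(str, chunk)), 2)) for chunk in chunks)


print(tobits_compact("Hi"))
print(frombits_compact(tobits_compact("Hi")))
```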
Simon Streicher
John Gaines Jr.
12

Not sure why, but here are two ugly one-liners using only builtins:

s = "Hi"
l = list(map(int, ''.join([bin(ord(i)).lstrip('0b').rjust(8,'0') for i in s])))  # list() so l can be sliced on Python 3
s = "".join(chr(int("".join(map(str,l[i:i+8])),2)) for i in range(0,len(l),8))

yields:

>>> l
[0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
>>> s
'Hi'

In real-world code, use the struct or the bitarray module.

ch3ka
  • 2
    lol, now I know why I should not use "l" as a variable name... never had those problems with my console font :D – ch3ka Apr 19 '12 at 23:17
  • 1
    you're welcome. But note that I wouldn't use something ugly like this in production code if I were you ;) – ch3ka Apr 19 '12 at 23:26
6

You could use the built-in bytearray:

>>> for i in bytearray('Hi', 'ascii'):
...     print(i)
... 
72
105

>>> bytearray([72, 105]).decode('ascii')
'Hi'

Then use bin() on each value to get its binary representation.
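
To get from the byte values to the bit list the question asks for, one option is to format each byte as eight binary digits. This is a minimal sketch; str_to_bits and bits_to_str are hypothetical helper names, and the input is assumed to be pure ASCII.

```python
def str_to_bits(s):
    # Iterating over a bytearray yields integers in range(256)
    return [int(bit)
            for byte in bytearray(s, 'ascii')
            for bit in format(byte, '08b')]


def bits_to_str(bits):
    # Pack each group of 8 bits (most significant first) into one byte value
    values = [int(''.join(map(str, bits[i:i + 8])), 2)
              for i in range(0, len(bits), 8)]
    return bytearray(values).decode('ascii')


print(str_to_bits('Hi'))
print(bits_to_str(str_to_bits('Hi')))
```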

Rik Poggi
4
def text_to_bits(text):
    """
    >>> text_to_bits("Hi")
    [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
    """
    bits = bin(int.from_bytes(text.encode(), 'big'))[2:]
    return list(map(int, bits.zfill(8 * ((len(bits) + 7) // 8))))

def text_from_bits(bits):
    """
    >>> text_from_bits([0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1])
    'Hi'
    """
    n = int(''.join(map(str, bits)), 2)
    return n.to_bytes((n.bit_length() + 7) // 8, 'big').decode()

See also, Convert Binary to ASCII and vice versa (Python).
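
One caveat with going through a single integer: leading zero bits carry no weight in the integer, so a string that starts with a NUL character does not round-trip. Below is a sketch of a variant that takes the width from the data length instead. The names text_to_bits_padded and text_from_bits_padded are invented here, and the bit list is assumed to have a length that is a multiple of 8.

```python
def text_to_bits_padded(text, encoding='utf-8'):
    data = text.encode(encoding)
    if not data:
        return []
    bits = bin(int.from_bytes(data, 'big'))[2:]
    # Pad to 8 bits per input byte so leading zero bytes are kept
    return [int(b) for b in bits.zfill(8 * len(data))]


def text_from_bits_padded(bits, encoding='utf-8'):
    if not bits:
        return ''
    n = int(''.join(map(str, bits)), 2)
    # Take the byte count from the list length, not from n.bit_length()
    return n.to_bytes(len(bits) // 8, 'big').decode(encoding)


print(text_to_bits_padded("Hi"))
print(repr(text_from_bits_padded(text_to_bits_padded("\x00A"))))
```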

jfs
1
def to_bin(string):
    res = ''
    for char in string:
        tmp = bin(ord(char))[2:]
        tmp = '%08d' %int(tmp)
        res += tmp
    return res

def to_str(string):
    res = ''
    for idx in range(len(string) // 8):  # integer division, so this also runs on Python 3
        tmp = chr(int(string[idx*8:(idx+1)*8], 2))
        res += tmp
    return res

These functions are really simple.
They don't use any third-party modules.

Damotorie
0

A few speed comparisons. Each of these was run using

python -m timeit "code"

or

cat <<-EOF | python -m timeit
    code
EOF

if multiline.

Bits to Byte

A: 100000000 loops, best of 3: 0.00838 usec per loop

res = 0
for idx,x in enumerate([0,0,1,0,1,0,0,1]):
    res |= x << (7 - idx)  # most significant bit first, matching B

B: 100000000 loops, best of 3: 0.00838 usec per loop

int(''.join(map(str, [0,0,1,0,1,0,0,1])), 2)

Byte to Bits

A: 100000000 loops, best of 3: 0.00836 usec per loop

[(41 >> x) & 1 for x in range(7, -1, -1)]

B: 100000 loops, best of 3: 2.07 usec per loop

list(map(int, bin(41)[2:].zfill(8)))
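
For anyone who wants to repeat the comparison, here is a sketch that times the same four approaches in-process with the timeit module, which avoids shell quoting. Note that python -m timeit takes its statement from the command-line arguments and times pass when none is given. The function names are hypothetical, and absolute numbers depend on the machine and Python version, so none are quoted here.

```python
import timeit

BITS = [0, 0, 1, 0, 1, 0, 0, 1]


def bits_to_byte_shift(bits=BITS):
    # Most significant bit first: shift the accumulator left, then add the bit
    res = 0
    for bit in bits:
        res = (res << 1) | bit
    return res


def bits_to_byte_str(bits=BITS):
    return int(''.join(map(str, bits)), 2)


def byte_to_bits_shift(value=41):
    return [(value >> x) & 1 for x in range(7, -1, -1)]


def byte_to_bits_str(value=41):
    # zfill(8) restores the leading zeros that bin() drops
    return list(map(int, bin(value)[2:].zfill(8)))


LOOPS = 100000
for fn in (bits_to_byte_shift, bits_to_byte_str,
           byte_to_bits_shift, byte_to_bits_str):
    seconds = timeit.timeit(fn, number=LOOPS)
    print("{}: {:.3f} usec per loop".format(fn.__name__, seconds / LOOPS * 1e6))
```

Timing callables adds a small function-call overhead to every loop, so the figures are best read relative to each other.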
d0c_s4vage
0
import math

class BitList:
    def __init__(self, value):
        if isinstance(value, str):
            value = sum([bytearray(value, "utf-8")[-i - 1] << (8*i) for i in range(len(bytearray(value, "utf-8")))])
        try:
            self.value = sum([value[-i - 1] << i for i in range(len(value))])
        except Exception:
            self.value = value

    def __getitem__(self, index):
        if isinstance(index, slice):
            if index.step is not None and index.step != 1:
                return list(self)[index]
            else:
                start = index.start if index.start else 0
                stop = index.stop if index.stop is not None else len(self)

                return BitList(math.floor((self.value % (2 ** (len(self) - start))) >> (len(self) - stop)))
        else:
            return bool(self[index:index + 1].value)

    def __len__(self):
        return self.value.bit_length()  # exact for any size, unlike float log2

    def __str__(self):
        return str(self.value)  # __str__ must return a str, not an int

    def __repr__(self):
        return "BitList(" + str(self.value) + ")"

    def __iter__(self):
        yield from [self[i] for i in range(len(self))]

You can then initialize BitList with a number, a string, or a list (of numbers or booleans), read its value, index it, slice it, and convert it to a list. Note: setting items is not supported yet; I will edit this post when I add that.

I wrote this myself, then went looking for how to convert a string (or a file) into a list of bits, and worked that part out from another answer.

Solomon Ucko
0

This works, but it does not follow PEP 8 (long lines, too complex):

tobits = lambda x: "".join(map(lambda y:'00000000'[len(bin(ord(y))[2:]):]+bin(ord(y))[2:],x))
frombits = lambda x: ''.join([chr(int(str(y), 2)) for y in [x[y:y+8] for y in range(0,len(x),8)]])

These are used like normal functions.
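
For comparison, here is a sketch of the same two conversions written as ordinary functions, which keeps the lines short. Like the lambdas, they work on a string of '0' and '1' characters rather than a list. The names to_bit_string and from_bit_string are made up for this example.

```python
def to_bit_string(text):
    # One zero-padded 8-digit binary group per character
    return ''.join(format(ord(ch), '08b') for ch in text)


def from_bit_string(bit_string):
    # Split into 8-character groups and parse each one as a base-2 number
    groups = (bit_string[i:i + 8] for i in range(0, len(bit_string), 8))
    return ''.join(chr(int(group, 2)) for group in groups)


print(to_bit_string('Hi'))
print(from_bit_string(to_bit_string('Hi')))
```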

Lil Taco
0

Because I like generators, I'll post my version here:

def bits(s):
    for c in s:
        yield from (int(bit) for bit in bin(ord(c))[2:].zfill(8))


def from_bits(b):
    for i in range(0, len(b), 8): 
        yield chr(int(''.join(str(bit) for bit in b[i:i + 8]), 2)) 


print(list(bits('Hi')))
[0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
print(''.join(from_bits([0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1])))
Hi
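
Because from_bits calls len() and slices its argument, it needs a sequence and cannot consume the bits generator directly. Below is a sketch of a variant that accepts any iterable; the name from_bits_iter is hypothetical.

```python
from itertools import islice


def from_bits_iter(bits):
    it = iter(bits)
    while True:
        # Pull the next (up to) 8 bits from the iterator
        chunk = list(islice(it, 8))
        if not chunk:
            return
        # Fold the chunk into an integer, most significant bit first
        value = 0
        for bit in chunk:
            value = (value << 1) | bit
        yield chr(value)


print(''.join(from_bits_iter(iter([0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]))))
```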
antonagestam
-1

If you have bits in a list, join them into a string and convert that to a number with int(..., 2). The number then behaves like a bit string, and bitwise operations can be applied to it. For example:

int(''.join(map(str, [1,0,0,1])), 2) | int(''.join(map(str, [1,0,1,1])), 2)
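
A sketch of the full round trip for this approach: convert both lists to integers, combine them, and turn the result back into a list of bits. The helper names are hypothetical, and both lists are assumed to have the same length.

```python
def bits_to_int(bits):
    # Join the digits into a string such as '1001' and parse it as base 2
    return int(''.join(map(str, bits)), 2)


def int_to_bits(value, width):
    # format pads with leading zeros up to the requested width
    return [int(b) for b in format(value, '0{}b'.format(width))]


a = [1, 0, 0, 1]
b = [1, 0, 1, 1]
combined = bits_to_int(a) | bits_to_int(b)  # 9 | 11
print(int_to_bits(combined, len(a)))
```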
Onur A.