I have 23 bits represented as a string, and I need to write this string to a binary file as 4 bytes. The last byte is always 0. The following code works (Python 3.3), but it doesn't feel very elegant (I'm rather new to Python and programming). Do you have any tips for making it better? It seems a for-loop might be useful, but how do I do the slicing within the loop without getting an IndexError? Note that when I extract the bits into a byte, I reverse the bit order.

from array import array

bin_array = array("B")
bits = "10111111111111111011110"    #Example string. It's always 23 bits
byte1 = bits[:8][::-1]
byte2 = bits[8:16][::-1]
byte3 = bits[16:][::-1]
bin_array.append(int(byte1, 2))
bin_array.append(int(byte2, 2))
bin_array.append(int(byte3, 2))
bin_array.append(0)

with open("test.bnr", "wb") as f:
    f.write(bytes(bin_array))

# Writes [253, 255, 61, 0] to the file
Olav

4 Answers


You can treat it as an int, then create the 4 bytes as follows:

>>> bits = "10111111111111111011110"
>>> int(bits[::-1], 2).to_bytes(4, 'little')
b'\xfd\xff=\x00'
Jon Clements
  • @Jon That is... amazing. Is it possible to go the other way? Something like: `int.from_bytes(b'\xfd\xff=\x00', 'little')` and get `"10111111111111111011110"` – Olav Jan 20 '14 at 10:18
  • @Olav, yup - format it appropriately: `format(int.from_bytes(b'\xfd\xff=\x00', 'little'), '023b')[::-1]` – Jon Clements Jan 20 '14 at 10:45
  • This question gets asked a lot on this site, but this is the only reasonable solution out of any of the answers. Thank you. – ICW Jul 20 '18 at 18:59
  • @YungGun: [`int.to_bytes()`](https://docs.python.org/3/library/stdtypes.html#int.to_bytes) wasn't added to Python until version 3.2, so for compatibility with current and older versions of the language using the `struct` module, as shown in [my answer](https://stackoverflow.com/a/21221295/355230), might be preferable since it will work in both Python 2.x and 3.x. – martineau Oct 06 '18 at 19:21
  • @Kebman not really... see https://en.wikipedia.org/wiki/Bit_numbering#Most-_vs_least-significant_bit_first – Jon Clements Sep 18 '20 at 15:34
  • This does not work with really long bitstrings. – Genfood Jun 16 '23 at 12:15
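
The one-liner from the answer and the reverse direction from the comments can be wrapped into a pair of helpers. This is only a sketch: the function names and the `length` parameter are made up for illustration, and the byte count is derived from the string length so that inputs longer than 32 bits do not overflow a fixed 4-byte width.

```python
def bits_to_bytes(bits, length=None):
    """Pack a bit string into bytes, reversing the bit order within each byte.

    `length` (hypothetical parameter) is the number of bytes to emit.
    By default just enough bytes are used to hold every bit.
    """
    if length is None:
        length = (len(bits) + 7) // 8   # round up to whole bytes
    if not bits:
        return bytes(length)            # int("", 2) would raise ValueError
    # Reversing the whole string and emitting little-endian bytes is
    # equivalent to reversing each 8-bit chunk separately.
    return int(bits[::-1], 2).to_bytes(length, "little")


def bytes_to_bits(data, n_bits):
    """Inverse of bits_to_bytes: recover the first n_bits as a string."""
    value = int.from_bytes(data, "little")
    return format(value, "0{}b".format(n_bits))[::-1][:n_bits]


bits = "10111111111111111011110"
packed = bits_to_bytes(bits, length=4)
print(packed)                       # b'\xfd\xff=\x00'
print(bytes_to_bits(packed, 23))    # 10111111111111111011110
```

`to_bytes` still raises `OverflowError` if an explicit `length` is too small for the value, so passing `length=None` is the safer choice when the input size varies.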

The struct module was designed for exactly this sort of thing. Consider the following, in which the conversion to bytes has been broken down into some unnecessary intermediate steps to make it easier to follow:

import struct

bits = "10111111111111111011110"  # example string. It's always 23 bits
int_value = int(bits[::-1], base=2)
bin_array = struct.pack('<i', int_value)  # '<' = little-endian, standard 4-byte int
with open("test.bnr", "wb") as f:
    f.write(bin_array)

A harder-to-read, but shorter, way would be:

bits = "10111111111111111011110"  # example string. It's always 23 bits
with open("test.bnr", "wb") as f:
    f.write(struct.pack('<i', int(bits[::-1], 2)))  # '<' = little-endian, 4 bytes
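
One detail worth checking when using struct: a format string without a prefix follows the host's native byte order and sizes, whereas a `<` or `>` prefix fixes the byte order and the standard 4-byte size for `I`. The comparison below is a small sketch of that difference using the same example bits; the variable names are illustrative only.

```python
import struct
import sys

bits = "10111111111111111011110"
value = int(bits[::-1], 2)

native = struct.pack("I", value)    # byte order follows the host CPU
little = struct.pack("<I", value)   # always little-endian, always 4 bytes
big = struct.pack(">I", value)      # always big-endian, for comparison

print(sys.byteorder)   # 'little' or 'big', depending on the machine
print(little)          # b'\xfd\xff=\x00'
print(big)             # b'\x00=\xff\xfd'

# Reading the value back with the same format string
(restored,) = struct.unpack("<I", little)
print(restored == value)   # True
```

Only the little-endian form reproduces the `[253, 255, 61, 0]` layout from the question on every platform.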
martineau
from array import array

bin_array = array("B")
bits = "10111111111111111011110"

bits = bits + "0" * (32 - len(bits))  # Align bits to 32, i.e. add "0" to tail
for index in range(0, 32, 8):
    byte = bits[index:index + 8][::-1]
    bin_array.append(int(byte, 2))

with open("test.bnr", "wb") as f:
    f.write(bytes(bin_array))
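
On the IndexError worry from the question: Python slices that run past the end of a string are clipped rather than raising, so padding is a convenience and not a requirement. A minimal sketch of the same loop without padding the string first, padding the resulting bytes instead:

```python
bits = "10111111111111111011110"

# Slices that run past the end are clipped instead of raising IndexError
print(bits[16:24])   # 1011110 (only 7 characters are available)
print(bits[24:32])   # prints an empty line

# Chunk without padding first, then pad the result to 4 bytes
chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
data = bytes(int(chunk[::-1], 2) for chunk in chunks).ljust(4, b"\x00")
print(list(data))    # [253, 255, 61, 0]
```

This works because reversing a short final chunk puts its bits in the low-order positions, which is the same result as padding with zeros on the right before reversing.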
Dmitry Vakhrushev

You can perform the split in one line using the re.findall method:

>>> bits = "10111111111111111011110"
>>> import re
>>> re.findall(r'\d{1,8}', bits)
['10111111', '11111111', '1011110']

As an algorithm, you can pad bits to length 32 and then use the re.findall method to group it into octets:

>>> bits
'10111111111111111011110000000000'
>>> re.findall(r'\d{8}', bits)
['10111111', '11111111', '10111100', '00000000']

Your code would then look like this:

import re
from array import array

bin_array = array("B")
bits = "10111111111111111011110".ljust(32, '0')  # pad it to length 32

for octet in re.findall(r'\d{8}', bits):  # split it into 4 octets
    bin_array.append(int(octet[::-1], 2))  # reverse each octet and append it

with open("test.bnr", "wb") as f:
    f.write(bytes(bin_array))
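
Note that `\d` matches any decimal digit, so a stray `2` would get through the regex and only fail later inside `int(..., 2)`. The sketch below restricts the pattern to `[01]` and validates the input up front. `split_octets` and its `width` parameter are hypothetical names, `width` is assumed to be a multiple of 8, and `re.fullmatch` requires Python 3.4 or newer.

```python
import re


def split_octets(bits, width=32):
    """Hypothetical helper: validate, pad to `width` bits, split into octets."""
    # \d would also accept digits 2-9; [01] restricts the input to real bits
    if not re.fullmatch(r"[01]{1,%d}" % width, bits):
        raise ValueError("expected 1 to %d characters of 0/1" % width)
    return re.findall(r"[01]{8}", bits.ljust(width, "0"))


octets = split_octets("10111111111111111011110")
print(octets)  # ['10111111', '11111111', '10111100', '00000000']
print(bytes(int(o[::-1], 2) for o in octets))  # b'\xfd\xff=\x00'
```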
Paulo Bu