I'm reading some MPEG Transport Stream protocol over UDP and it has some funky bitfields in it (length 13 for example). I'm using the "struct" library to do the broad unpacking, but is there a simple way to say "Grab the next 13 bits" rather than have to hand-tweak the bit manipulation? I'd like something like the way C does bit fields (without having to revert to C).

Suggestions?

Grace Note
ZebZiggle

2 Answers

The bitstring module is designed to address just this problem. It will let you read, modify and construct data using bits as the basic building blocks. The latest versions are for Python 2.6 or later (including Python 3) but version 1.0 supported Python 2.4 and 2.5 as well.

A relevant example for you might be this, which strips out all the null packets from a transport stream (and quite possibly uses your 13 bit field?):

from bitstring import Bits, BitStream

# Opening from a file means that it won't be all read into memory
s = Bits(filename='test.ts')

# The with-block closes the output file even if an error occurs
with open('test_nonull.ts', 'wb') as outfile:
    # Cut the stream into 188 byte packets
    for packet in s.cut(188*8):
        # Take a 13 bit slice and interpret as an unsigned integer
        PID = packet[11:24].uint
        # Write out the packet unless the PID is 8191 (0x1FFF), the 'null' packet PID
        if PID != 8191:
            # The 'bytes' property converts back to a byte string
            outfile.write(packet.bytes)
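
For comparison, the same 13-bit slice can be taken with only the standard library. This is a minimal sketch and not part of bitstring: `bit_slice` is a hypothetical helper, and the 4-byte header is made-up sample data laid out like the start of a transport stream packet (sync byte 0x47, three flag bits, then the 13-bit PID).

```python
def bit_slice(data: bytes, start: int, end: int) -> int:
    """Return bits [start, end) of data as an unsigned integer.

    Bit 0 is the most significant bit of the first byte, and the end
    index is exclusive, the same convention as a Python slice.
    """
    total_bits = len(data) * 8
    if not 0 <= start <= end <= total_bits:
        raise ValueError("bit range out of bounds")
    value = int.from_bytes(data, "big")
    # Shift away the bits after `end`, then mask off the bits before `start`.
    return (value >> (total_bits - end)) & ((1 << (end - start)) - 1)


# Made-up header: sync byte 0x47, flag bits 000, PID 0x1FFF (the null PID).
header = bytes([0x47, 0x1F, 0xFF, 0x10])
pid = bit_slice(header, 11, 24)
is_null_packet = pid == 8191
```

The mask width is `end - start`, so the length of the field is always the difference between the two indices.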

Here's another example including reading from bitstreams:

# You can create from hex, binary, integers, strings, floats, files...
# This has a hex code followed by two 12 bit integers
s = BitStream('0x000001b3, uint:12=352, uint:12=288')
# Append some other bits
s += '0b11001, 0xff, int:5=-3'
# read back as 32 bits of hex, then two 12 bit unsigned integers
start_code, width, height = s.readlist('hex:32, 2*uint:12')
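# Skip some bits then peek at next bit value
s.pos += 4
# Peek as a bool: in recent bitstring versions any non-empty bitstring,
# including the result of peek(1), is truthy whatever the bit's value
if s.peek('bool'):
    flags = s.read(9)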
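
To show what sequential reading amounts to underneath, here is a minimal standard-library sketch of a "grab the next N bits" reader. `BitReader` is a hypothetical class written for illustration, not bitstring's API, and the 4-byte input is made-up sample data.

```python
class BitReader:
    """Tiny sequential bit reader (illustrative only, not bitstring's API)."""

    def __init__(self, data: bytes):
        self._value = int.from_bytes(data, "big")
        self._length = len(data) * 8
        self.pos = 0  # current bit position, counted from the first bit

    def peek(self, nbits: int) -> int:
        """Return the next nbits as an unsigned int without advancing."""
        if nbits < 0 or self.pos + nbits > self._length:
            raise ValueError("not enough bits left")
        shift = self._length - self.pos - nbits
        return (self._value >> shift) & ((1 << nbits) - 1)

    def read(self, nbits: int) -> int:
        """Return the next nbits as an unsigned int and advance past them."""
        value = self.peek(nbits)
        self.pos += nbits
        return value


# Made-up header: sync byte, 3 flag bits, 13-bit PID, then 8 more bits.
reader = BitReader(bytes([0x47, 0x40, 0x11, 0x1A]))
sync = reader.read(8)    # first byte
flags = reader.read(3)   # three 1-bit flags packed together
pid = reader.read(13)    # "grab the next 13 bits"
```

Keeping `peek` separate from `read` mirrors the snippet above: you can inspect upcoming bits before deciding whether to consume them.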

You can use standard slice notation to slice, delete, reverse, overwrite, etc. at the bit level, and there are bit level find, replace, split etc. functions. Different endiannesses are also supported.

# Replace every '1' bit by 3 bits
s.replace('0b1', '0b001')
# Find all occurrences of a bit sequence
bitposlist = list(s.findall('0b01000'))
# Reverse bits in place
s.reverse()

The full documentation is here.

Scott Griffiths
  • I think the packet[11:24].uint should be packet[12:24].uint. The field is 13 bits long, starts at bit 12, ends at bit 24. –  Jul 17 '12 at 14:59
  • This should really have been a comment and not an answer, but no, it really is [11:24]. Indices are zero based and are non-inclusive of the end index (which is standard usage in Python and many other languages). So a slice of just the first bit would be [0:1], whereas [12:24] would be a 12 bit slice from the 13th to the 24th bit inclusive. Note that the length is always the difference between the two indices. – Scott Griffiths Aug 02 '12 at 10:01
It's an often-asked question. There's an ASPN Cookbook entry on it that has served me in the past.

And there is an extensive page of requirements one person would like to see from a module doing this.

Thomas Vander Stichele