10

I want to parse some data with Python and scapy. To do that I have to analyse single bits. At the moment I have, for example, UDP packets with a payload like:

bytes = b'\x18\x00\x03\x61\xFF\xFF\x00\x05\x42\xFF\xFF\xFF\xFF'

Is there any elegant way to convert the bytes so that I can access single bits like:

bytes_as_bits = convert(bytes)
bit_at_index_42 = bytes_as_bits[42]
Steve Guidi
N. Stra

6 Answers

11

That will work:

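def access_bit(data, num):
    # Bits are numbered from the least significant bit of each byte.
    base = num // 8   # index of the byte that holds the bit
    shift = num % 8   # position of the bit within that byte
    return (data[base] >> shift) & 0x1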

If you'd like to create a binary array you can use it like this:

[access_bit(data,i) for i in range(len(data)*8)]
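
Note that access_bit numbers bits from the least significant bit of each byte. If the index should instead run MSB-first, which is how protocol field diagrams are usually drawn, a variant could look like the sketch below. The names access_bit_msb and to_bits are made up for this illustration and are not part of the answer above.

```python
def access_bit_msb(data: bytes, num: int) -> int:
    """Return bit `num` of `data`, counting from the MSB of byte 0."""
    base, offset = divmod(num, 8)
    # Offset 0 is the most significant bit, so shift by 7 - offset.
    return (data[base] >> (7 - offset)) & 0x1


def to_bits(data: bytes) -> list:
    """Expand `data` into a list of 0/1 ints, MSB-first within each byte."""
    return [access_bit_msb(data, i) for i in range(len(data) * 8)]


payload = b'\x18\x00\x03\x61\xFF\xFF\x00\x05\x42\xFF\xFF\xFF\xFF'
bits = to_bits(payload)
print(bits[:8])  # [0, 0, 0, 1, 1, 0, 0, 0], i.e. 0x18
print(bits[42])  # 1 (third bit of the sixth byte, 0xFF)
```

With this numbering, bits[42] matches the bytes_as_bits[42] access from the question.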
Liran Funaro
7

If you would like a bit string, or want to avoid writing a function, I would use format() and ord(). Let me use a simpler example to illustrate:

bytes = '\xf0\x0f'
bytes_as_bits = ''.join(format(ord(byte), '08b') for byte in bytes)

This should output: '1111000000001111'

If you want LSB first you can just flip the output of format(), so:

bytes = '\xf0\x0f'
bytes_as_bits = ''.join(format(ord(byte), '08b')[::-1] for byte in bytes)

This should output: '0000111111110000'

Now you want to use b'\xf0\x0f' instead of '\xf0\x0f'. In Python 2 the code works the same, but in Python 3 iterating over a bytes object already yields integers, so you have to drop ord():

bytes = b'\xf0\x0f'
bytes_as_bits = ''.join(format(byte, '08b') for byte in bytes)

Flipping the string works the same way as before.
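
As a quick Python 3 sketch of both orderings side by side (the variable names here are illustrative only): the result is a string, so indexing yields the characters '0' or '1', and int() is needed to get a numeric bit.

```python
data = b'\xf0\x0f'

# MSB-first: each byte is formatted as 8 binary digits.
msb_first = ''.join(format(byte, '08b') for byte in data)

# LSB-first: reverse the 8 digits of each byte, keeping the byte order.
lsb_first = ''.join(format(byte, '08b')[::-1] for byte in data)

print(msb_first)          # 1111000000001111
print(lsb_first)          # 0000111111110000
print(int(msb_first[3]))  # 1
```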


ZKK
4

There is no built-in bit type in Python, but you can do something like:

>>> bin(int.from_bytes(b"hello world", byteorder="big")).lstrip('0b')
'110100001100101011011000110110001101111001000000111011101101111011100100110110001100100'

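The .lstrip('0b') call strips every leading '0' and 'b' character, not only the '0b' prefix. Since bin() also omits leading zero bits, the result above has 87 digits rather than 88, so string indices only line up with bit positions if the string is zero-padded to 8 digits per byte.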
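
Because bin() does not keep leading zero bits, a padded variant may be safer when the string is going to be indexed. The following is a minimal sketch; the helper name bytes_to_bitstring is hypothetical.

```python
def bytes_to_bitstring(data: bytes) -> str:
    """Return `data` as a '0'/'1' string with exactly 8 digits per byte."""
    if not data:
        return ""
    width = len(data) * 8
    # The '0{width}b' format spec pads with zeros up to the full bit width.
    return format(int.from_bytes(data, byteorder="big"), f"0{width}b")


bits = bytes_to_bitstring(b"hello world")
print(len(bits))  # 88
print(bits[:8])   # 01101000, the byte for 'h' (0x68)
```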

hello world
2
>>> n=17
>>> [(n & (1<<x))>>x for x in [7,6,5,4,3,2,1,0]]
[0, 0, 0, 1, 0, 0, 0, 1]
Atila Romero
0

To extend @Liran's answer, I have added byteorder as an input argument, which defaults to 'big'. Note that I am not referring to bit packing within the bytes.

def access_bit(b: bytes, n: int, byteorder: str = "big") -> int:
    """
    Returns the value (0 or 1) of the nth bit (n) from the byte array (b).
    The byteorder argument accepts the literal strings ['little', 'big'] and
    refers to the byte order (endianness).
    """
    base = n // 8   # how many bytes away from the least significant byte
    shift = n % 8   # bit position within that byte, counted from the LSB
    if byteorder == "big":
        return (b[-base - 1] >> shift) & 0x1
    elif byteorder == "little":
        return (b[base] >> shift) & 0x1
    else:
        raise ValueError("byteorder only recognises 'big' or 'little'")

access_bit(b, 0) returns the least significant bit of the least significant byte assuming big-endian

access_bit(b, 7) returns the most significant bit of the least significant byte assuming big-endian

access_bit(b, 0, 'little') returns the least significant bit of the least significant byte specifying little-endian

access_bit(b, 7, 'little') returns the most significant bit of the least significant byte specifying little-endian

Specifying an index n outside the range of the byte array raises an IndexError (e.g. access_bit(b'\x05\x01', 16) fails because the highest valid index is 15).
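
One way to cross-check this numbering is to note that bit n, as defined here, is bit n of the integer that int.from_bytes builds with the same byte order. The sketch below uses a hypothetical helper, bit_of_int, to show this. Unlike access_bit, it returns 0 for an out-of-range n instead of raising an error.

```python
def bit_of_int(b: bytes, n: int, byteorder: str = "big") -> int:
    """Return bit n of the integer encoded by b (bit 0 = least significant)."""
    value = int.from_bytes(b, byteorder=byteorder)
    return (value >> n) & 0x1


data = b'\x05\x01'

# Big-endian: 0x0501, so bits 0, 8 and 10 are set.
print([bit_of_int(data, n) for n in range(16)])
# [1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0]

# Little-endian: 0x0105, so bits 0, 2 and 8 are set.
print([bit_of_int(data, n, "little") for n in range(16)])
# [1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
```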

lifedroid
0

I would just use a simple lambda expression to convert the bytes to a string, zero-padded to 8 digits per byte so that string indices match bit positions:

>>> bytes = b'\x18\x00\x03\x61\xFF\xFF\x00\x05\x42\xFF\xFF\xFF\xFF'
>>> convert = lambda x: f"{int.from_bytes(x, 'big'):0{len(x) * 8}b}"
>>> bytes_as_bits = convert(bytes)
>>> bytes_as_bits[42]
'1'

'big' is the byteorder to be used. The official Python documentation describes it as follows:

The byteorder argument determines the byte order used to represent the integer. If byteorder is "big", the most significant byte is at the beginning of the byte array. If byteorder is "little", the most significant byte is at the end of the byte array. To request the native byte order of the host system, use sys.byteorder as the byte order value.
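
A small illustration of the difference, using a two-byte value chosen only for this example:

```python
import sys

data = b'\x01\x00'

# Big-endian: the first byte is the most significant, so this is 0x0100.
print(int.from_bytes(data, byteorder="big"))     # 256

# Little-endian: the first byte is the least significant, so this is 0x0001.
print(int.from_bytes(data, byteorder="little"))  # 1

# Native byte order of the machine running the code: 'little' or 'big'.
print(sys.byteorder)
```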

PythonGuru