8

I want to use a bit array in Python that I could use like the standard bitset from C++. Example:

#include<bitset>
int main() {
    std::bitset<100> numBits;
}

However, I don't know if there is something similar in Python, preferably built-in.

martineau
starkshang

4 Answers

3

Normally, you just use the built-in int class (or long in Python 2). Maybe wrap it in a class if you particularly care about hiding the shifts.
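
A minimal sketch of what such a wrapper might look like. The class name `IntBitSet` and its methods (`set`, `test`, `count`) are made up for illustration, loosely mirroring `std::bitset`; they are not a standard library API.

```python
class IntBitSet:
    """Hypothetical fixed-size bitset backed by a single Python int."""

    def __init__(self, size):
        self.size = size
        self.bits = 0  # an arbitrary-precision int holds every bit

    def _check(self, pos):
        if not 0 <= pos < self.size:
            raise IndexError("bit position out of range")

    def set(self, pos, value=True):
        self._check(pos)
        if value:
            self.bits |= 1 << pos      # turn the bit on
        else:
            self.bits &= ~(1 << pos)   # turn the bit off

    def test(self, pos):
        self._check(pos)
        return bool((self.bits >> pos) & 1)

    def count(self):
        # Number of bits currently set.
        return bin(self.bits).count("1")


num_bits = IntBitSet(100)
num_bits.set(3)
num_bits.set(99)
```

Here `num_bits.test(3)` and `num_bits.test(99)` are true and every other position is false; the shifts stay hidden behind the methods.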

o11c
1

There is nothing built-in for that. If you need such a data structure in order to produce bytes with the correct bits set, such as for a network protocol, a binary file structure or hardware control, packing a list of True and False values into a sequence of bytes is straightforward.

One could also create a class to allow direct manipulation of in-memory bits in a bytearray object. However, unlike in C++, you won't gain speed or memory advantages from that (OK, for large bitsets you could save memory): Python will process each bit as a full reference to the True or False objects (or to full 0 and 1 integers) regardless of what you do in code.

That said, if you have a list with True and False values you want to output to, say, a file, as a sequence of bits, code like this might work:

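a = [True, True, False, False, False, True]  # ...and so on
with open("myfile.bin", "wb") as file:
    byte = 0
    for i, value in enumerate(a):
        # Shift in each bit, most significant bit first.
        byte = (byte << 1) | bool(value)
        if i % 8 == 7:
            # Eight bits collected: write them out as one byte.
            file.write(bytes([byte]))
            byte = 0
    if len(a) % 8:
        # Left-align the leftover bits and pad the rest with zeros.
        byte <<= 8 - len(a) % 8
        file.write(bytes([byte]))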

A more sophisticated way is to write a full class for it, keeping the values in a bytearray object and computing each bit's position on every get and set operation. A minimalist way of doing that is:

class BitArray(object):
    def __init__(self, length):
        # One byte holds 8 bits; round the byte count up.
        self.values = bytearray((length + 7) // 8)
        self.length = length

    def _check(self, index):
        # IndexError also tells iteration (used by __repr__) where to stop.
        if not 0 <= index < self.length:
            raise IndexError("bit index out of range")

    def __setitem__(self, index, value):
        self._check(index)
        bit = 1 << (7 - index % 8)
        # Clear the target bit, then set it again if value is truthy.
        self.values[index // 8] &= 0xff ^ bit
        if value:
            self.values[index // 8] |= bit

    def __getitem__(self, index):
        self._check(index)
        mask = 1 << (7 - index % 8)
        return bool(self.values[index // 8] & mask)

    def __len__(self):
        return self.length

    def __repr__(self):
        return "<{}>".format(", ".join("{:d}".format(value) for value in self))

As you can see, there is no speed gain in doing so, and you'd need a lot of bits to benefit from the memory savings. This is an example of the above class in use at the interactive prompt:

In [50]: a = BitArray(16)

In [51]: a[0] = 1

In [52]: a[15] = 1

In [53]: a
Out[53]: <1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1>
martineau
jsbueno
0

Well, you could make a "bitset" using a list of booleans:

mybitset = [True, False, False, True, False]

I lack context to give you a more appropriate response.

DainDwarf
  • That probably depends on the python interpreter used. – DainDwarf Dec 28 '15 at 16:07
  • `sys.getsizeof(mybitset)` returns 104 on my machine. Think of it as a pointer to a singleton, because the `id` of every True and False is the same: `>>> [id(x) for x in mybitset] [139974218568224, 139974218568256, 139974218568256, 139974218568224, 139974218568256] `. – Tomasz Gandor Nov 28 '18 at 12:28
0

You could use an ordinary list; however, this would not be very memory-efficient: on 32-bit Python builds it would waste 4 bytes per "bit", and on 64-bit builds 8 bytes. This is because the elements of a list are really (references to) other Python objects.
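
To make the per-element overhead concrete, here is a small sketch that compares a list of booleans with a packed bytearray of the same capacity. It assumes CPython, where `sys.getsizeof` is supported; the exact numbers vary by version and platform, so none are quoted here.

```python
import sys

n_bits = 1000

as_list = [False] * n_bits                 # one object reference per "bit"
as_bytes = bytearray((n_bits + 7) // 8)    # eight bits packed into each byte

list_size = sys.getsizeof(as_list)
packed_size = sys.getsizeof(as_bytes)

print("list of bools:", list_size, "bytes")
print("packed bytearray:", packed_size, "bytes")
```

On a typical build the list comes out far larger, since it stores a full pointer for every entry.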

The Python standard library also has the array module, which is much more efficient for storing homogeneous values than the generic list, but unfortunately it does not support bits as a data type. Furthermore, it does not provide the Set interface.

Thus, if memory efficiency is a concern, your choices boil down to either building your own Python bitset implementation on top of array, or installing a third-party module from PyPI, such as intbitset.
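
As a rough sketch of the first option, a set of small non-negative integers could be stored in an `array` of unsigned bytes. The class name `ArrayBitSet` and its `capacity` parameter are hypothetical, and only a few set-like methods are shown; this does not reproduce the intbitset API.

```python
from array import array


class ArrayBitSet:
    """Hypothetical set of ints in range(capacity), one bit per member."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Typecode "B" is an unsigned byte; allocate enough bytes for all bits.
        self._bytes = array("B", [0]) * ((capacity + 7) // 8)

    def _locate(self, item):
        if not 0 <= item < self.capacity:
            raise ValueError("item out of range")
        return item >> 3, 1 << (item & 7)   # (byte index, bit mask)

    def add(self, item):
        index, mask = self._locate(item)
        self._bytes[index] |= mask

    def discard(self, item):
        index, mask = self._locate(item)
        self._bytes[index] &= ~mask & 0xFF  # keep the value within one byte

    def __contains__(self, item):
        if not isinstance(item, int) or not 0 <= item < self.capacity:
            return False
        return bool(self._bytes[item >> 3] & (1 << (item & 7)))

    def __iter__(self):
        return (i for i in range(self.capacity) if i in self)

    def __len__(self):
        return sum(1 for _ in self)


s = ArrayBitSet(100)
s.add(3)
s.add(99)
s.discard(3)
```

After these calls, `99 in s` is true and `3 in s` is false. Membership tests and updates touch a single byte, while iteration and `len` scan the whole range, which is a trade-off of keeping the sketch short.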