2

I have a bytearray like this:

a=bytearray([False]*size)

I want to store the False value not as a byte but a bit in the byte array; I care about the memory. Is there anyway I can do that? As using 8 bits to store 1 bit value seems like a waste. Also I cannot use bitarray import. I have spent few hours on this now so any help would be much appreciated.

ThePyGuy
  • 17,779
  • 5
  • 18
  • 45
alex ale
  • 31
  • 5
  • Each element of a bytearray is an integer in the range 0 - 255 (0x00 - 0xff). So you can pack 8 bits per element. You'll probably have to manage the indexing yourself - just right shift the bit index by 3 to get the byte index, and use shifts and bit masks to access the desired bit. – Tom Karzes May 22 '21 at 23:51
  • could you show me an example of how to do this? because i want to set every bit in the byte array to false and then based on some hash i will set them to true. So what you said answers my question but could you show me how to do it in code, so i don't waste soo much memory? – alex ale May 23 '21 at 00:03
  • If you really care about the memory usage this much, you should be using a better data structure than a `bytearray` (eg. something from [this question](https://stackoverflow.com/questions/142812/does-python-have-a-bitfield-type)). However, keep in mind that "premature optimization is the root of all evil". Do you really need to optimize this? – SuperStormer May 23 '21 at 00:04
  • if there is a data structure better than bytearray in "standard python library" that can do what i explained above then let me know. – alex ale May 23 '21 at 00:06
  • The stdlib doesn't have anything like this builtin AFAIK because it is a pretty rare case. If you really need this, you should be using a dedicated library or write your own class. – SuperStormer May 23 '21 at 00:07

1 Answers1

3

Here's a simple class that will do what you want. It implements bit arrays on top of the bytearray type:

class BitArray:
    def __init__(self, size):
        self.size = size
        self.bytearray = bytearray((size + 7) >> 3)

    def clear(self):
        ba = self.bytearray
        for i in range(len(ba)):
            ba[i] = 0

    def get_bit(self, bit_ix):
        if not isinstance(bit_ix, int):
            raise IndexError("bit array index not an int")

        if bit_ix < 0 or bit_ix >= self.size:
            raise IndexError("bit array index out of range")

        byte_ix = bit_ix >> 3
        bit_ix = bit_ix & 7
        return (self.bytearray[byte_ix] >> bit_ix) & 1

    def set_bit(self, bit_ix, val):
        if not isinstance(bit_ix, int):
            raise IndexError("bit array index not an int")

        if bit_ix < 0 or bit_ix >= self.size:
            raise IndexError("bit array index out of range")

        if not isinstance(val, int):
            raise ValueError("bit array value not an int")

        if val not in (0, 1):
            raise ValueError("bit array value must be 0 or 1")

        byte_ix = bit_ix >> 3
        bit_ix = bit_ix & 7
        bit_val = 1 << bit_ix

        if val:
            self.bytearray[byte_ix] |= bit_val
        else:
            self.bytearray[byte_ix] &= ~bit_val

    def __getitem__(self, key):
        return self.get_bit(key)

    def __setitem__(self, key, value):
        self.set_bit(key, value)

The bit array is initially all zero upon creation. The clear method will set all bits to zero, the get_bit method will return the value of a bit, and the set_bit method will set the value of a bit. All bit values are either 0 or 1 (although you could easily change it to have the values be True or False).

Here's an example of how to use it:

ba = BitArray(23)
print(ba.get_bit(17))

ba.set_bit(17, 1)
print(ba.get_bit(17))

ba.clear()
print(ba.get_bit(17))

This creates a bit array large enough to hold 23 bits. It then prints bit 17, which is 0. It then sets bit 17 to 1 and prints it. It then clears the entire array and again prints bit 17. The output is:

0
1
0

Update: I added __getitem__ and __setitem__, so you can now use index notation to directly index individual bits. Example:

ba[17] = 1
print(ba[17])

This sets bit 17 to 1, then prints it. The output is:

1

Note that the indices must be integers. Slices are not supported.

Tom Karzes
  • 22,815
  • 2
  • 22
  • 41