12

I have integers in the range 0..2**m - 1 and I would like to convert them to binary numpy arrays of length m. For example, say m = 4. Now 15 = 1111 in binary and so the output should be (1,1,1,1). 2 = 10 in binary and so the output should be (0,0,1,0). If m were 3 then 2 should be converted to (0,1,0).

I tried np.unpackbits(np.uint8(num)) but that doesn't give an array of the right length. For example,

np.unpackbits(np.uint8(15))
Out[5]: array([0, 0, 0, 0, 1, 1, 1, 1], dtype=uint8)

I would like a method that works for whatever m I have in the code.

Simd
  • Should `m` be inferred from the numbers in the array, or specified as an argument? – amaurea Mar 06 '14 at 14:47
  • @amaurea Specified as an argument. – Simd Mar 06 '14 at 15:02
  • 1
    For anyone who found this on google trying to unpackbits for uint8's in-place in an array, I tried vectorizing and axis but neither worked. The solution is actually a lot simpler: `np.unpackbits(a).reshape(*a.shape,8)` – evn Feb 06 '22 at 02:00

5 Answers

16

You should be able to vectorize this, something like

>>> d = np.array([1,2,3,4,5])
>>> m = 8
>>> (((d[:,None] & (1 << np.arange(m)))) > 0).astype(int)
array([[1, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0],
       [1, 0, 1, 0, 0, 0, 0, 0]])

which just gets the appropriate bit weights and then takes the bitwise and:

>>> (1 << np.arange(m))
array([  1,   2,   4,   8,  16,  32,  64, 128])
>>> d[:,None] & (1 << np.arange(m))
array([[1, 0, 0, 0, 0, 0, 0, 0],
       [0, 2, 0, 0, 0, 0, 0, 0],
       [1, 2, 0, 0, 0, 0, 0, 0],
       [0, 0, 4, 0, 0, 0, 0, 0],
       [1, 0, 4, 0, 0, 0, 0, 0]])

There are lots of ways to convert this to 1s wherever it's non-zero: (> 0)*1, .astype(bool).astype(int), etc. I chose one basically at random.
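Note that the output above lists the least significant bit first, while the question asks for the most significant bit first (2 with m = 4 should give (0,0,1,0)). A minimal sketch of the same broadcasting idea with the bit order reversed follows; the helper name `to_bits_msb_first` is made up for illustration and is not part of NumPy.

```python
import numpy as np

def to_bits_msb_first(values, m):
    """Expand non-negative integers into m bits each, most significant bit first."""
    values = np.asarray(values)
    # Shift amounts m-1, ..., 1, 0 so the highest-order bit lands in column 0
    shifts = np.arange(m - 1, -1, -1)
    # Broadcasting adds a trailing axis of length m; "& 1" isolates each bit
    return ((values[..., None] >> shifts) & 1).astype(np.uint8)

print(to_bits_msb_first([15, 2], 4))
# [[1 1 1 1]
#  [0 0 1 0]]
print(to_bits_msb_first(2, 3))
# [0 1 0]
```

Shifting and masking with 1 yields 0/1 values directly, so no separate comparison step is needed.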

DSM
10

One-line version, taking advantage of the fast path in numpy.binary_repr:

def bin_array(num, m):
    """Convert a positive integer num into an m-bit bit vector"""
    return np.array(list(np.binary_repr(num).zfill(m))).astype(np.int8)

Example:

In [1]: bin_array(15, 6)
Out[1]: array([0, 0, 1, 1, 1, 1], dtype=int8)

Vectorized version for expanding an entire numpy array of ints at once:

def vec_bin_array(arr, m):
    """
    Arguments: 
    arr: Numpy array of positive integers
    m: Number of bits of each integer to retain

    Returns a copy of arr with every element replaced with a bit vector.
    Bits encoded as int8's.
    """
    to_str_func = np.vectorize(lambda x: np.binary_repr(x).zfill(m))
    strs = to_str_func(arr)
    ret = np.zeros(list(arr.shape) + [m], dtype=np.int8)
    for bit_ix in range(0, m):
        fetch_bit_func = np.vectorize(lambda x: x[bit_ix] == '1')
        ret[...,bit_ix] = fetch_bit_func(strs).astype("int8")

    return ret 

Example:

In [1]: vec_bin_array(np.array([[100, 42], [2, 5]]), 8)

Out[1]: array([[[0, 1, 1, 0, 0, 1, 0, 0],
                [0, 0, 1, 0, 1, 0, 1, 0]],

               [[0, 0, 0, 0, 0, 0, 1, 0],
                [0, 0, 0, 0, 0, 1, 0, 1]]], dtype=int8)
Fred Reiss
0

Here's a somewhat 'hacky' solution.

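def bin_array(num, m):
    """Returns an array representing the binary representation of num in m bits."""
    # Store num as one big-endian unsigned 64-bit integer; this works for 1 <= m <= 64.
    # (Building a dtype from ceil(m / 8) bytes fails for sizes such as 3, since '>i3' is not a valid dtype.)
    num_arr = np.array([num], dtype='>u8')
    # View the 8 bytes as uint8, unpack to 64 bits, and keep the m lowest-order bits
    return np.unpackbits(num_arr.view(np.uint8))[-m:]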
Jayanth Koushik
0

Seems like you could just modify the resulting array. I don't know the function exactly, but most implementations like np.unpackbits would not inherently know the size of the number - python ints can be arbitrarily large, after all, and don't have a native size.

However, if you know m, you can easily 'fix' the array. Basically, an unpack function gives you a number of bits that is a multiple of 8, enough to cover the byte containing the highest 1 in the number. You just need to remove the extra leading 0s, or prepend 0s, to get the right length:

m = 4
mval = np.unpackbits(np.uint8(15))

if len(mval) > m:
    # Keep only the m lowest-order bits
    mval = mval[m-len(mval):]
elif m > len(mval):
    # Prepend zeros until the array has m entries
    mval = np.concatenate([np.zeros(m-len(mval), dtype=np.uint8), mval])
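The trim-or-pad step above can be wrapped in a small reusable helper. This is only a sketch: the name `fit_bits` is hypothetical, and it assumes the input bits are ordered most significant first, as `np.unpackbits` returns them by default.

```python
import numpy as np

def fit_bits(bits, m):
    """Trim leading entries or left-pad with zeros so the result has exactly m bits."""
    bits = np.asarray(bits, dtype=np.uint8)
    if len(bits) >= m:
        # Drop the extra high-order entries (also handles m == 0 correctly)
        return bits[len(bits) - m:]
    # Prepend zeros for the missing high-order positions
    padding = np.zeros(m - len(bits), dtype=np.uint8)
    return np.concatenate([padding, bits])

bits = np.unpackbits(np.array([15], dtype=np.uint8))
print(fit_bits(bits, 4))   # [1 1 1 1]
print(fit_bits(bits, 12))  # [0 0 0 0 0 0 0 0 1 1 1 1]
```

Slicing with an explicit start index rather than a negative one avoids the edge case where `bits[-0:]` would return the whole array.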
Corley Brigman
0
m = 24
# signature "()->(24)" tells np.vectorize that each scalar input maps to a length-m vector
tobin = np.vectorize(lambda x: np.array(list(np.binary_repr(x, m)), dtype=int), signature="()->({})".format(m))

array = np.array([[10,10],[20,20]])
print(tobin(array))

"""
[[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0]
  [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0]]

 [[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0]
  [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0]]]
shape: (2, 2, 24)
"""
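For readers unfamiliar with the `signature` argument of `np.vectorize`, here is a smaller sketch of the same idea. It assumes a symbolic core dimension name (`n`) instead of a formatted number, and uses m = 6 purely to keep the output short; neither choice comes from the answer above.

```python
import numpy as np

m = 6  # number of bits per value, picked only for this illustration

# "()->(n)" declares a scalar input and a 1-D output; the outputs are
# stacked along a new trailing axis of length n (here n == m).
tobin = np.vectorize(
    lambda x: np.array(list(np.binary_repr(x, width=m)), dtype=int),
    signature="()->(n)",
)

values = np.array([[10, 10], [20, 20]])
bits = tobin(values)
print(bits.shape)  # (2, 2, 6)
print(bits[0, 0])  # [0 0 1 0 1 0]
```

Since `np.vectorize` is essentially a Python-level loop, this form is convenient rather than fast; the bitwise broadcasting approach is likely a better fit for large arrays.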



  • 2
    Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Tyler2P Oct 14 '21 at 16:08