Is there a way to convert back and forth between a binary vector and a 128-bit number? I have the following binary vector:

import numpy as np

bits = np.array([1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1,
                 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
                 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0,
                 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1,
                 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1,
                 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1], dtype=np.uint8)

which is an MD5 hash that I am trying to use as a feature for a scikit-learn machine learning classifier (I need to represent the hash as a single feature).

mozway
Kevin
  • Does this answer your question? [Convert binary (0|1) numpy to integer or binary-string?](https://stackoverflow.com/questions/41069825/convert-binary-01-numpy-to-integer-or-binary-string) – mozway May 14 '22 at 14:09
  • @mozway I tried that solution, but it only handles numbers up to 64 bits. – Kevin May 14 '22 at 14:09
  • Maybe [this answer](https://stackoverflow.com/a/47515534/14923227) will work, but I am not sure how to convert the number back to a binary vector. – Kevin May 14 '22 at 14:14
  • `numpy` (and `sklearn`) have an `int64` max integer size. Floats can be larger. Python ints can also be longer, but can only be put in arrays as objects. – hpaulj May 14 '22 at 15:09
  • @hpaulj I believe `sklearn` uses `float32` as its dtype according to their documentation: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier.fit So if I try to use an `int64` or `float64`, it will get truncated, right? – Kevin May 18 '22 at 18:37

1 Answer

As commented above, numpy's native integer dtypes only go up to 64 bits, but Python ints have arbitrary precision, so a 128-bit int is no problem.
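To make the size limit concrete, here is a small sketch in plain Python. The value `n` is an arbitrary 128-bit number picked for illustration, not the hash from the question. It shows that a Python int keeps all 128 bits, while a trip through a 64-bit float (53 significant bits) does not.

```python
# Arbitrary 128-bit value, chosen only for illustration (not the MD5 hash above).
n = (1 << 127) + 1

# Python ints are arbitrary precision, so all 128 bits are kept.
print(n.bit_length())

# A Python float is a 64-bit double with a 53-bit significand,
# so converting to float and back cannot preserve every bit.
round_trip = int(float(n))

print(round_trip == n)   # whether the value survived the round trip
print(n - round_trip)    # how far off the float version is
```

So if the integer is later cast to a float feature, the low bits of the hash are rounded away; the exact value only survives as a Python int.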

The following goes from the binary np.array to a Python int and back to a binary np.array.

import numpy as np

bits = np.array(
    [
        1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1,
        0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
        1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0,
        1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1,
        0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1,
        1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1
    ],
    dtype=np.uint8
)

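s = "".join(bits.astype(str))  # join the 0/1 digits into one binary string
n = int(s, 2)  # parse the string as a base-2 integer
print(n)

# Format back to binary, zero-padded to the original width;
# bin(n)[2:] would silently drop any leading zeros of the hash.
s = format(n, f"0{len(bits)}b")
bits_back = np.array(list(s), dtype=np.uint8)  # digits back into a uint8 array
print(np.array_equal(bits, bits_back))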
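As an alternative to going through a string, the same round trip can be sketched with `np.packbits` / `np.unpackbits` and `int.from_bytes` / `int.to_bytes`. The helper names `bits_to_int` and `int_to_bits` are made up for this example, and the sketch assumes the vector is stored most significant bit first, as in the question. Passing the width explicitly keeps leading zeros intact.

```python
import numpy as np


def bits_to_int(bits):
    """Pack a 0/1 vector (most significant bit first) into a Python int."""
    bits = np.asarray(bits, dtype=np.uint8)
    pad = (-len(bits)) % 8          # packbits zero-pads the last byte on the right
    packed = np.packbits(bits)      # 8 bits per byte, big-endian bit order
    return int.from_bytes(packed.tobytes(), "big") >> pad


def int_to_bits(n, width=128):
    """Unpack a non-negative Python int into a 0/1 vector of fixed width."""
    nbytes = (width + 7) // 8
    raw = np.frombuffer(n.to_bytes(nbytes, "big"), dtype=np.uint8)
    return np.unpackbits(raw)[-width:]  # drop padding bits at the front


# Hypothetical 128-bit input that starts with zeros, to exercise the padding.
example = np.array([0, 0, 1, 0, 1, 1, 0, 1] * 16, dtype=np.uint8)
n = bits_to_int(example)
restored = int_to_bits(n, width=len(example))
print(np.array_equal(example, restored))
```

Note that `n` is still a plain Python int here; placing it in a numpy array would require `dtype=object`, as mentioned in the comments.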

ljmc