Fastest one-to-one mapping of binary 1d Numpy array to integer

Question

What's the fastest way of converting a 1d Numpy array containing only 0s and 1s to a unique integer?

The best I came up with so far is to use Cython and see the array as a mirrored binary number*.

@cython.boundscheck(False)
@cython.wraparound(False)
def _map_binary(np.ndarray[np.int64_t, ndim=1] x):

    cdef int tot = 0
    cdef int i
    cdef int n = x.shape[0]

    for i in xrange(n):
        if x[i]:
            tot += 2**i
    return tot

As this is still the bottleneck of my algorithm, I am wondering if there's a smarter and faster way to do it.

* Obviously this mapping is one-to-one only for arrays with the same length (as appending zeros to the array doesn't change the resulting integer), but this is fine for my purposes.

Not sure of it's speed but you could try `int("".join([str(c) for c in x]),2)` — Nick is tired, Jan 12 '17 at 16:19
http://stackoverflow.com/questions/41069825/convert-binary-01-numpy-to-integer-or-binary-string — Ari Gold, Jan 12 '17 at 16:19
You don't need to do `** 2` on each iteration, you could keep a variable and multiply it by 2 on each iteration, I don't know if it would make a noticeable difference though. — Francisco, Jan 12 '17 at 16:20
Second suggestion would be to specify the array as contiguous (`mode=c`), provided that it is, of course. — DavidW, Jan 12 '17 at 16:44
Base on @AriGold's link, the fastest one on NumPy seems to be `np.dot(x,1 << np.arange(x.size))`. — Divakar, Jan 12 '17 at 17:08
@NickA, everything having to do with strings is quite slow, same goes for Numpy methods, still in the microseconds range @Divakar. Regarding @DavidW 's and @FranciscoCouzo 's suggestions, I suppose there might be marginal improvements but I think I need a more precise timing approach then the easy `%timeit` to really see if they are there. Suggestions for more accurate timing are appreciated. — Pietro Marchesi, Jan 12 '17 at 21:25

DavidW · Answer 1 · 2017-01-13T19:15:39.500

Collating some of the ideas from the comments and some of my own

A few pure Python (+numpy) options:

import numpy as np

def _map_binary_str_and_back(x):
    # from comment by @NickA
    return int("".join([str(c) for c in x]),2)

def _map_binary_np_dot(x):
    # from question http://stackoverflow.com/questions/41069825/convert-binary-01-numpy-to-integer-or-binary-string
    return np.dot(x,1 << np.arange(x.size))

def _map_binary_np_pack(x):
    # uses built in numpy function packbits, but unfortunately needs a bit of manipulation
    # afterwards
    x = np.packbits(x) # as np.int8
    x.resize((8,),refcheck=False)
    return x.view(dtype=np.int64)

Some Cython options (note that I've changed the output to a 64-bit integer so that it works with arrays up to 64 elements long):

cimport cython
cimport numpy as np

def _map_binary_original(np.ndarray[np.int64_t, ndim=1] x):

    cdef np.uint64_t tot = 0
    cdef np.uint64_t i
    cdef int n = x.shape[0]

    for i in xrange(n):
        if x[i]:
            tot += 2**i
    return tot

@cython.boundscheck(False)
@cython.wraparound(False)
def _map_binary_contig(np.ndarray[np.int64_t, ndim=1, mode="c"] x):

    cdef np.uint64_t tot = 0
    cdef np.uint64_t i
    cdef int n = x.shape[0]

    for i in xrange(n):
        if x[i]:
            tot += 2**i
    return tot

@cython.boundscheck(False)
@cython.wraparound(False)    
def _map_binary_shift(np.ndarray[np.int64_t, ndim=1, mode="c"] x):

    cdef np.uint64_t tot = 0
    cdef np.uint64_t i
    cdef int n = x.shape[0]

    for i in xrange(n):
        if x[i]:
            tot += 1<<i
    return tot

@cython.boundscheck(False)
@cython.wraparound(False)    
def _map_binary_times2(np.ndarray[np.int64_t, ndim=1, mode="c"] x):
    # @FranciscoCouzo

    cdef np.uint64_t tot = 0
    cdef np.uint64_t i
    cdef int n = x.shape[0]

    for i in xrange(n):
        tot *= 2
        if x[i]:            
            tot +=1 
    return tot

@cython.boundscheck(False)
@cython.wraparound(False)    
def _map_binary_times2_as_shift(np.ndarray[np.int64_t, ndim=1, mode="c"] x):

    cdef np.uint64_t tot = 0
    cdef np.uint64_t i
    cdef int n = x.shape[0]

    for i in xrange(n):
        tot *= 2
        if x[i]:            
            tot +=1 
    return tot

And (for reference) some timing code

from map_binary import (_map_binary_original,_map_binary_contig,
                        _map_binary_shift,_map_binary_times2,
                        _map_binary_times2_as_shift)

test_array = np.random.randint(2,size=(60,)).astype(dtype=np.int64)

def time_function(name):
    from timeit import timeit

    num = 10000
    timed = timeit("f(x)","from __main__ import {} as f, test_array as x".format(name),number=num)/num
    print(name, timed)

for n in list(globals().keys()):
    if n.startswith('_map_binary'):
        time_function(n)

The results (slightly reformatted for clarity):

_map_binary_str_and_back    9.774386967484043e-05
_map_binary_np_dot          7.402434574531678e-06
_map_binary_np_pack         1.5813756692768855e-06
_map_binary_original        7.462656716457738e-07
_map_binary_contig          7.208434833198e-07
_map_binary_shift           5.84043665719558e-07
_map_binary_times2          6.467991376011505e-07
_map_binary_times2_as_shift 6.412435894529889e-07

In summary:

Of the "no Cython" versions using np.packbits is the quickest option by some margin (but obviously worse than the Cython versions). However the no-Cython versions need further work to ensure they give the same answer (I think dot is suffering from an integer overflow. packbits flips the endianness, so gives a valid but different answer)
Specifying the contiguous-ness of the array makes things marginally quicker.
Shift seems the fastest, followed by multiplying followed by power (shift is only the fastest if you use unsigned integers).

Mazdak · Answer 2 · 2017-01-12T16:53:27.433

0

I'm not sure if your solution is the best way at an algorithmic perspective but for the sake of writing more optimized Cython codes I'd suggest following changes:

Using memory-view arrays in Cython
Using non-pythonic codes when you are using Cython, so that Cython can generate smaller C codes from your python code. So instead of using shape and indexing since you have a 1D array you can use size() and instead of using xrange and indexing just loop over the array and increase the variable i in each iteration (or at least just use range()), that's because xrange is a generator and requires more work to be converted to C.
Using C libraries like pow
Predefine the type your functions

form ibc.math cimport pow

@cython.boundscheck(False)
@cython.wraparound(False)
cdef int _map_binary(np.int32_t[:] x):
    cdef int tot = 0
    cdef int i = 0
    cdef int n = x.size
    cdef int item
    for item in x:
        if item:
            tot = tot + pow(2, i)
        i = i + 1
    return tot

edited Jan 12 '17 at 16:53

answered Jan 12 '17 at 16:45

Mazdak

105,000
18
159
188

Cython actually converts `xrange` and `range` to a simple C loop, so they actually are the best way of doing things. I think (but could be wrong) that `for item in x` will the use the Python iterator interface and be comparatively slow. – DavidW Jan 12 '17 at 16:50
@DavidW Agreed, but I'm not sure too. The only thing that I'm sure is that `xragnge` is slower than the rest. – Mazdak Jan 12 '17 at 16:55
@Kasramvd, thank you for your answer. I don't think using `cdef` is an option for me as this function needs to be available to Python code, at least for now. Secondly, `pow(2,i)` appears to return a double, which means you can't add it to `tot`. Overall, it's much slower (worse than my pure Python version). You are right though that I can get a marginal speedup by turning `xrange` into `range`. – Pietro Marchesi Jan 12 '17 at 20:35
@PietroMarchesi Regarding `cdef` you should call this function in your cython code with another python function (which has been defined by `def`). And regarding `pow()` you're right it returns double, and in that case, you can return a double too, or if you don't want to do this using simple python power would be good too as it's not a big deal here. – Mazdak Jan 12 '17 at 20:41
But after all, I recommend you to change your algorithm instead of the implementation, then if you needed more performance you can turn that to a Cython code. – Mazdak Jan 12 '17 at 20:42

Fastest one-to-one mapping of binary 1d Numpy array to integer

2 Answers2