I am trying to figure out the best way to parallelize a single operation that is applied to each cell of a 2D numpy array.

In particular, I need to do a bitwise operation for each cell in the array.

This is what I do using nested for loops:

for x in range(M):
    for y in range(N):
        v[x][y] = (v[x][y] >> 7) & 255

I found a way to do the same thing using numpy.vectorize:

import numpy

def f(x):
    return (x >> 7) & 255
f = numpy.vectorize(f)

v = f(v)

However, using vectorize doesn't seem to improve performance.

I read about numexpr in this answer on StackOverflow, which also mentions Theano and Cython. Theano in particular seems like a good solution, but I cannot find examples that fit my case.

So my question is: what is the best way to improve the above code, using parallelization and possibly GPU computation? Could someone post some sample code to do this?

Vito Gentile
  • You could look at `multiprocessing.Pool`: define a function for your bitwise operation and send it the list of all your cells. It will then use all your processors to evaluate the result, but you will need to reconstruct the array, which can cost you time. How big is your array? How long does your calculation take? – CoMartel May 18 '15 at 11:58
  • With such a simple element-wise transform, the overall performance may be memory IO bound. I.e., moving data to/from a GPU may make the operation slower overall. – rickhg12hs May 18 '15 at 12:17
  • Correct the indentation and variable names in the 1st code piece. – hpaulj May 18 '15 at 16:05
  • @hpaulj, I've just done that, thanks – Vito Gentile May 18 '15 at 16:07
  • The normal way to index a 2d array is `v[x,y]`. Using `[x][y]` may be equivalent here, but if `v[x]` produces a copy rather than a view, it does not work. – hpaulj May 18 '15 at 16:58
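
The chunking idea from the first comment can be sketched as follows. This is only an illustration under stated assumptions: it swaps `multiprocessing.Pool` for a thread pool (so no data is copied between processes and nothing has to be reassembled), it assumes `v` is a 2D integer array of a wide dtype such as int64, and the helper names `shift_mask` and `parallel_shift_mask` are hypothetical. Because the operation is likely memory-bound, as noted in the comments, it may not beat the plain NumPy expression, so measure before adopting it.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def shift_mask(block):
    # Hypothetical helper: apply (x >> 7) & 255 in place on one block of rows.
    np.right_shift(block, 7, out=block)
    np.bitwise_and(block, 255, out=block)


def parallel_shift_mask(v, workers=4):
    # Hypothetical helper: split the rows into `workers` contiguous slices.
    # Basic slicing (v[a:b]) yields views, so each thread writes directly
    # into its own rows of the original array.
    bounds = np.linspace(0, v.shape[0], workers + 1, dtype=int)
    blocks = [v[a:b] for a, b in zip(bounds[:-1], bounds[1:])]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list(...) forces all tasks to finish and surfaces any exception.
        list(pool.map(shift_mask, blocks))
    return v


rng = np.random.default_rng(0)
v = rng.integers(0, 2**16, size=(37, 50), dtype=np.int64)
expected = (v >> 7) & 255
result = parallel_shift_mask(v.copy(), workers=4)
```

The array is modified in place, so pass a copy if the original values are still needed.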

1 Answer

I am not familiar with bitwise operations, but the following gives the same result as your code and is vectorized.

import numpy as np

# make sure it is a numpy.array
v = np.array(v)

# vectorized computation
v = (v >> 7) & 255
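
To check the claim that the whole-array expression matches the nested loops from the question, a small self-contained comparison might look like the sketch below. The array shape, the random test data, and the int64 dtype are assumptions made only for this illustration; the names `looped` and `vectorized` are hypothetical.

```python
import numpy as np

# Assumed test data: a small integer array standing in for the real v.
rng = np.random.default_rng(0)
M, N = 200, 300
v = rng.integers(0, 2**16, size=(M, N), dtype=np.int64)

# Element-by-element version, as in the question.
looped = v.copy()
for x in range(M):
    for y in range(N):
        looped[x, y] = (looped[x, y] >> 7) & 255

# Whole-array version: the operators are applied to every element at once.
vectorized = (v >> 7) & 255

# Both approaches should agree on every cell.
assert np.array_equal(looped, vectorized)
```

The shift and mask operators require an integer dtype; on a float array the expression raises a TypeError instead of silently converting.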
plonser