
I need to speed up the following code:

for i in range(0, 2**N):
    output[i] = f(np.array(map(int, bin(i)[2:].zfill(N))))

N is around 30, so the code is very slow (it takes about 33 hours on my laptop). The argument of the function f() is the binary representation of the index i, and f() can be an arbitrary vectorizable function. I'm not an expert, but in order to speed up the code, I was thinking to get rid of the for loop, which means that I need to vectorize the argument of f(). In other words, I have to create a matrix with the binary representations of the numbers from 0 to 2**N. This can be achieved through the following code:

list(itertools.product([0, 1], repeat=N))

which I found at this link. However, it seems to me that itertools is very slow, and materializing the full list clearly takes a huge amount of memory, since 2**30 is about one billion rows.
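As a sketch of one alternative (my addition, not from the original post): the bit matrix can be built without strings or itertools using NumPy shifts and broadcasting, and processed in chunks so the full 2**N-row matrix never has to fit in memory at once. The helper name `bits_of`, the chunk size, and the demo value of `N` are my own choices.

```python
import numpy as np

def bits_of(indices, n_bits):
    # Turn a 1-D integer array into a (len(indices), n_bits) matrix of
    # 0/1 values, most significant bit first, via shifts and masking.
    shifts = np.arange(n_bits - 1, -1, -1)
    return (indices[:, None] >> shifts) & 1

N = 20          # demo size; the same loop works for N = 30
CHUNK = 2**16   # tune to available memory

# Process indices chunk by chunk instead of materializing all 2**N rows:
for start in range(0, 2**N, CHUNK):
    idx = np.arange(start, min(start + CHUNK, 2**N))
    block = bits_of(idx, N)    # shape (len(idx), N)
    # output[idx] = f(block)   # f applied to whole rows, if vectorizable
```

With a vectorized `f`, this replaces a billion Python-level iterations with a few thousand array operations.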

Do you have any suggestion for making this code faster? Thanks in advance.

user2983638
  • You seem to be throwing away all but the last value of `output`. – user2357112 Oct 31 '16 at 17:32
  • 8
    Instead of producing a billion element array, why not rewrite as a generator? – dawg Oct 31 '16 at 17:33
  • Look up generators and the `yield` command. – oliversm Oct 31 '16 at 17:34
  • @user2357112: Oops, you're right. I've fixed it! – user2983638 Oct 31 '16 at 17:34
  • Possibly also try the more Pythonic syntax `['do something' for i in ...]` – oliversm Oct 31 '16 at 17:35
  • If `f` is computationally expensive there might not be much that you can do. If it takes e.g. a 10th of a millisecond for a single function evaluation, and you are doing over a billion such evaluations, you are looking at around 30 hours no matter what. Perhaps you could refactor so that the computation that `f` does is implemented directly in the loop. At least that would save the overhead of a billion function calls. – John Coleman Oct 31 '16 at 17:48
  • 2
    Also if you are using Python 2.7, by all means use "xrange" , not "range" - that alone would create 2 ^30 objects in your system memory. – jsbueno Oct 31 '16 at 17:55
  • 1
    Are you actually using all billion values? Can you just generate each value as it is called and [then store that value if it is called again](https://en.wikipedia.org/wiki/Memoization)? – dawg Oct 31 '16 at 17:58
  • I didn't know about generators, so I'm going to try also this option. Unfortunately I think memoization cannot be applied to my case, but thanks for the suggestion! – user2983638 Oct 31 '16 at 18:42

1 Answer


Always profile:

>>> timeit.timeit("for i in range(0, 2**N): numpy.array(map(int, bin(i)[2:].zfill(N)))", "import numpy; N=5", number=100000)
26.472519159317017
>>> timeit.timeit("for t in itertools.product((0, 1), repeat=N): numpy.array(t)", "import numpy, itertools; N=5", number=100000)
6.129688024520874

You can see that the itertools.product method is considerably faster, since it doesn't have to fiddle around with strings.

The problem could be that most of the time is spent in the f function itself, though.

Another solution could be to make f accept an integer, and use it as a binary field.
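For example (a minimal sketch of that idea, with a stand-in `f`; the names `bit` and `f_popcount` are my own, not from the answer): individual bits of `i` can be read directly with shifts and masks, so `f` never needs a string or an array at all.

```python
def bit(i, j):
    # j-th least-significant bit of the integer i
    return (i >> j) & 1

def f_popcount(i):
    # Stand-in for an arbitrary f that only needs the bits of i:
    # here, the number of set bits in i.
    return bin(i).count("1")

N = 20  # demo size
output = [f_popcount(i) for i in range(2**N)]
```

Whether this helps depends on whether the real `f` can be rewritten in terms of bitwise operations on the integer.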

Francisco
  • Additionally: Time for 2^24 `pass` operations on a recentish MBP: 883ms with just `range`, 501ms with `xrange`, 1.92s with a function call to `pass` and `range`, and 1.58s for function call + `xrange`. – a p Oct 31 '16 at 18:12
  • Cool, I tried with `N=20` and it takes 1 min rather than 1.5 min. This will save a lot of time for larger `N`. I guess 22 hours vs 33 for `N=30`. – user2983638 Oct 31 '16 at 18:36