The arrays I'm checking are boolean. In this case, np.count_nonzero() appears to be the most efficient way to do a "sum". I still wonder if there's a way to make it faster, probably by doing the "greater than" check at the same time as the counting!

Here's a toy example in which I time my approach on one large array rather than on plenty of small ones (my timing loop, averaging over 100 trials, is probably quite crude, but whatever), and then do the same thing on a small array to demonstrate how much faster it "should" be:

import time
import numpy as np

hugeflatarray = np.ones(100000000, dtype=bool)
smallflatarray = np.ones(10, dtype=bool)
smallvalue = 1

# time.clock() was removed in Python 3.8; time.perf_counter() replaces it here
mytimes = []
for i in range(100):
    t1 = time.perf_counter()
    np.count_nonzero(hugeflatarray) > smallvalue
    t2 = time.perf_counter()
    mytimes.append(t2 - t1)
print("average time for huge array:" + str(np.mean(mytimes)))

# same measurement on the 10-element array
mytimes = []
for i in range(100):
    t1 = time.perf_counter()
    np.count_nonzero(smallflatarray) > smallvalue
    t2 = time.perf_counter()
    mytimes.append(t2 - t1)
print("average time for small array:" + str(np.mean(mytimes)))

average time for huge array:0.0111809413765

average time for small array:9.83558325865e-07
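
The manual timing loop could probably be replaced with timeit.repeat from the standard library. This is only a sketch: the 1,000,000-element array and the names arr, small_value and best_per_call are assumptions chosen to keep the run short, not the sizes used for the numbers above.

```python
import timeit
import numpy as np

# Assumed sizes for illustration; the question uses 100,000,000 elements.
arr = np.ones(1_000_000, dtype=bool)
small_value = 1
calls_per_run = 20

# Five independent runs of 20 calls each; each entry is the total time of one run.
timings = timeit.repeat(
    lambda: np.count_nonzero(arr) > small_value,
    repeat=5,
    number=calls_per_run,
)

# The minimum is usually the least noisy summary of repeated timings.
best_per_call = min(timings) / calls_per_run
print(f"best time per call: {best_per_call:.3e} s")
```

Taking the minimum rather than the mean tends to reduce the influence of other processes that happen to run during the measurement.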

np.count_nonzero() probably works by going through the whole array and accumulating the count as it goes, right? Wouldn't it be faster if there were a way to stop as soon as "smallvalue" is exceeded? A "short-circuit" of sorts.
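
One way to approximate such a short-circuit in plain NumPy might be to count in chunks and stop once the threshold is exceeded. This is a sketch under assumptions: the function name count_exceeds_chunked and the chunk_size default are made up for illustration, and the array is assumed to be one-dimensional.

```python
import numpy as np

def count_exceeds_chunked(arr, threshold, chunk_size=65536):
    """Return True as soon as more than `threshold` elements of a 1-D array are nonzero."""
    count = 0
    for start in range(0, arr.size, chunk_size):
        # Count one chunk with the fast compiled routine...
        count += np.count_nonzero(arr[start:start + chunk_size])
        # ...and stop early once the threshold has been exceeded.
        if count > threshold:
            return True
    return False

print(count_exceeds_chunked(np.ones(1_000_000, dtype=bool), 1))   # True
print(count_exceeds_chunked(np.zeros(1_000_000, dtype=bool), 1))  # False
```

When the answer is True early on, only the first chunk is scanned. When the threshold is never exceeded, every chunk is still visited, so the worst case is no better than a single np.count_nonzero() call plus the Python loop overhead.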

edit:

@user2357112 After reading your advice I've tried a numba solution, and it DOES appear to be slightly faster than np.count_nonzero(hugearray)>smallvalue! Thank you. Here's my solution:

import numba

@numba.jit(numba.boolean(numba.boolean[:], numba.int64))
def jitcountgreaterthan(hugearray, smallvalue):
    a = numba.int64(0)
    for i in hugearray:
        a += i
        # uses > (not ==) so the result matches count_nonzero(...) > smallvalue
        if a > smallvalue:
            break
    return a > smallvalue

I did this weird "break, THEN return" because numba apparently doesn't support return statements in a for loop, but in practice it doesn't seem to affect anything.
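
As far as I know, recent Numba versions do accept a return statement inside a loop in nopython mode, so the same idea can be sketched without the break. The function name count_exceeds is hypothetical, and the fallback decorator is only there so the snippet also runs (slowly, as plain Python) when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback assumption: without Numba, run the function as plain Python.
    def njit(func):
        return func

@njit
def count_exceeds(arr, threshold):
    """Return True as soon as more than `threshold` elements are True."""
    count = 0
    for x in arr:
        if x:
            count += 1
            if count > threshold:
                return True  # early exit from inside the loop
    return False

print(count_exceeds(np.ones(1000, dtype=bool), 1))
```

Leaving out the explicit signature lets Numba infer the types on the first call, at the cost of compiling during that call.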

440hertz
  • You'd probably need Numba or Cython for that. – user2357112 Mar 27 '17 at 17:42
  • Compiled `numpy` code doesn't provide that option. My guess is that a custom function would have to break early on to work any faster (on average). – hpaulj Mar 27 '17 at 18:30
  • You might find this discussion about finding the first `nan` enlightening: http://stackoverflow.com/questions/41320568/what-is-the-most-efficient-way-to-find-the-position-of-the-first-np-nan-value – hpaulj Mar 27 '17 at 20:10
