0

I was trying to locate the index of a value on a numpy array (same for hundreds of them) using np.where. Its not that slow but its a bottleneck so i began experimenting with fortran and f2py and wrote this simple routine.

subroutine find_index(array1, num_elements, target_value, loc)
    real, intent(in) :: array1(:)
    integer, intent(in) :: num_elements, target_value
    integer, intent(out) :: loc
    do i = 1, num_elements
        if (array1(i) .eq. target_value) then
            loc = i
            exit
        endif
    end do
end subroutine

But still no improvement (same as np.where). So i guess its about the method. any sugestions on improving the code (python or fortran)?

EDIT the values i am searching for are integers in an array of integers

George Pamfilis
  • 1,397
  • 2
  • 19
  • 37
  • something like y = minloc(abs(array1 - target_value)) and checking whether or not array(y) == target_value. Note that comparisons with floating point values should be done with a small delta! – albert Mar 10 '18 at 12:14
  • Why do ypu think cslling Fortran would be any faster? Surely numpy routines are implemented in C or Fortran and thoroughly optimized. – Vladimir F Героям слава Mar 10 '18 at 12:17
  • also i clarified in the answer that the values are integers i am searching for – George Pamfilis Mar 10 '18 at 12:18
  • array1 is still of type real. – albert Mar 10 '18 at 12:20
  • You didn't say which Fortran compiler you intend; ifort should be able to vectorize the code you show. That should be significantly faster than minloc() or even findloc() (an f2008 intrinsic). As Vladimir said, you should be able to find numpy implementations which do the same. If num_elements exceeds 100000, you should be able to gain by splitting the array to search in parallel, then finishing by combining the results. If you are doing several hundred small arrays, likewise you should be able to gain by calling your function in multiple threads. – tim18 Mar 10 '18 at 12:51
  • gfortran on ubuntu linux. – George Pamfilis Mar 10 '18 at 13:15
  • 1
    PLEASE DO NOT SCREAM WITH CAPITAL LETTERS! Variable `array1` in the code you show is declared `real` no matter how big letters you use. – Vladimir F Героям слава Mar 10 '18 at 13:23
  • Since you are searching for the 1st, it should be faster. `where` will iterate 3 times to find all. – hpaulj Mar 10 '18 at 14:17
  • the array input are integers. ill change up the real part. sorry for the screams :) – George Pamfilis Mar 10 '18 at 15:34
  • Are your arrays sorted? How long are each of your arrays? Will each array only be searched once? Are the searches and results independent from each other? – Matt P Mar 12 '18 at 16:13
  • in some cases yes while in others no. – George Pamfilis Mar 13 '18 at 13:30

1 Answers1

1

It's been sometime since I worked with fortran and f2py, but I did something similar with cython last year.

In a Hungarian algorithm search problem, I needed to find the first 0 value in a 2d array, subject to row and column masking arrays.

So using where (argwhere is just np.transpose(np.where(...)), the function was:

def find_a_zero(self):
    # find first uncovered 0 cost
    rc = np.argwhere((self.cost + self.rc[:,None] + self.cc) == 0)
    if rc.shape[0]>0:
        return tuple(rc[0])
    else:
        return None, None

I got a good speedup with argmax:

def find_a_zero(self):
    # big help time wise, 16->10 for n=200
    cond = (self.cost + self.rc[:,None] + self.cc) == 0
    if np.count_nonzero(cond):
        idx = np.unravel_index(np.argmax(cond), cond.shape)
        return idx
    return None, None

np.where uses count_nonzero to determine the size of its return arrays. argmax, when operating on a boolean, short circuits on the fist True.

I got even better speed with this cython version:

cdef find_a_zero(int[:,:] cost, int[:] row_cover, int[:] col_cover):
    n = cost.shape[0]
    m = cost.shape[1]
    cdef size_t r, c
    for r in range(n):
        for c in range(m):
            if (cost[r,c]==0) and (row_cover[r]==0) and (col_cover[c]==0):
                row = r
                col = c
                return r, c
    return -1, -1

If my memory of cython is correct, definitions like int[:,:] cost invoke its typed memoryview. which has efficient lowlevel array operations.

http://cython.readthedocs.io/en/latest/src/userguide/memoryviews.html

http://csclab.murraystate.edu/~bob.pilgrim/445/munkres.html

hpaulj
  • 221,503
  • 14
  • 230
  • 353