3

I have an array obtained from sp.distance.cdist, and it looks as follows:

[[ 0.          5.37060126  2.68530063  4.65107712  2.68530063  4.65107712
   2.04846297  7.41906423  4.11190697  6.50622284  4.11190697  6.50622284]
 [ 5.37060126  0.          4.65107712  2.68530063  4.65107712  2.68530063
   7.41906423  2.04846297  6.50622284  4.11190697  6.50622284  4.11190697]
 [ 2.68530063  4.65107712  0.          2.68530063  4.65107712  5.37060126
   4.11190697  6.50622284  2.04846297  4.11190697  6.50622284  7.41906423]
 [ 4.65107712  2.68530063  2.68530063  0.          5.37060126  4.65107712
   6.50622284  4.11190697  4.11190697  2.04846297  7.41906423  6.50622284]
 [ 2.68530063  4.65107712  4.65107712  5.37060126  0.          2.68530063
   4.11190697  6.50622284  6.50622284  7.41906423  2.04846297  4.11190697]
 [ 4.65107712  2.68530063  5.37060126  4.65107712  2.68530063  0.
   6.50622284  4.11190697  7.41906423  6.50622284  4.11190697  2.04846297]
 [ 2.04846297  7.41906423  4.11190697  6.50622284  4.11190697  6.50622284
   0.          9.4675272   4.7337636   8.19911907  4.7337636   8.19911907]
 [ 7.41906423  2.04846297  6.50622284  4.11190697  6.50622284  4.11190697
   9.4675272   0.          8.19911907  4.7337636   8.19911907  4.7337636 ]
 [ 4.11190697  6.50622284  2.04846297  4.11190697  6.50622284  7.41906423
   4.7337636   8.19911907  0.          4.7337636   8.19911907  9.4675272 ]
 [ 6.50622284  4.11190697  4.11190697  2.04846297  7.41906423  6.50622284
   8.19911907  4.7337636   4.7337636   0.          9.4675272   8.19911907]
 [ 4.11190697  6.50622284  6.50622284  7.41906423  2.04846297  4.11190697
   4.7337636   8.19911907  8.19911907  9.4675272   0.          4.7337636 ]
 [ 6.50622284  4.11190697  7.41906423  6.50622284  4.11190697  2.04846297
   8.19911907  4.7337636   9.4675272   8.19911907  4.7337636   0.        ]]

What I'm trying to do, using numpy, is to search for values in a range, for example between 2.3 and 2.7, and at the same time return the indices where they are found in each row of the array. I have read a lot, and I have found for example .argmin(), which does part of what I want (but it only shows where the zeros or values lower than zero are located, and only one occurrence). In the documentation of .argmin I cannot find anything on how to find the minimum other than zero, or how to keep it from stopping after the first occurrence. I need to do this for all the values in the interval. To explain myself better, this is what I expect to get:

e.g.:

[row (0), index (2), index (4)]
[row (1), index (3), index (5)]
[row (2), index (0), index (3)]

What would be the best way to do this? In the meantime, I'll keep trying and if I find a solution I'll post it here.

Thanks.

muammar
  • What do you want to use the array indices for? You are probably better off simply using the result of `(v > 2.3) & (v < 2.7)` (a boolean array) instead of an array of indices. – Sven Marnach Nov 20 '13 at 14:32
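As a rough sketch of what the comment above suggests, the boolean array can be used directly to select or count the matching values, without building an index array first. The small array `v` and the names `mask`, `in_range` and `count_per_row` are made up for illustration; they are not from the question.

```python
import numpy as np

# Small illustrative array (an assumption, not the full matrix from the question)
v = np.array([[0.0, 5.37060126, 2.68530063, 4.65107712, 2.5],
              [5.37060126, 4.65107712, 2.68530063, 0.11190697, 1.0]])

# Boolean array: True where the value lies strictly between 2.3 and 2.7
mask = (v > 2.3) & (v < 2.7)

# Boolean indexing returns the matching values as a 1-D array, in row-major order
in_range = v[mask]

# Summing the mask along axis 1 counts the matches in each row
count_per_row = mask.sum(axis=1)

print(in_range)
print(count_per_row)
```

If the indices are only needed to pick out or modify the matching entries, the mask alone is usually enough.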

2 Answers

2

What you are looking for is the np.argwhere function, which returns the indices at which a condition on an array is satisfied.

import numpy as np

v = np.array([[0.        , 5.37060126, 2.68530063, 4.65107712, 2.5],
              [5.37060126, 4.65107712, 2.68530063, 0.11190697, 1. ]])

# one [row, column] pair per element that satisfies the condition
np.argwhere((v > 2.3) & (v < 2.7))

array([[0, 2],
       [0, 4],
       [1, 2]])
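If the pairs should be collected per row, as in the format asked for in the question, one possible way is to loop over the result of np.argwhere and build a dictionary. This is only a sketch; `pairs` and `rows_to_cols` are hypothetical names chosen for this example.

```python
import numpy as np

v = np.array([[0.0, 5.37060126, 2.68530063, 4.65107712, 2.5],
              [5.37060126, 4.65107712, 2.68530063, 0.11190697, 1.0]])

# One [row, column] pair per element strictly between 2.3 and 2.7
pairs = np.argwhere((v > 2.3) & (v < 2.7))

# Collect the column indices of each row; .tolist() gives plain Python ints
rows_to_cols = {}
for row, col in pairs.tolist():
    rows_to_cols.setdefault(row, []).append(col)

print(rows_to_cols)  # {0: [2, 4], 1: [2]}
```

Rows without any match simply do not appear in the dictionary.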
Acorbe
1

What you need is numpy.where, which returns a tuple containing the indices along each dimension where some condition is True for the values of a numpy.ndarray. Example using your data:

import numpy as np

# a is the 12x12 distance matrix from the question
i, j = np.where((a > 2.3) & (a < 2.7))
# i -> array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5])
# j -> array([2, 4, 3, 5, 0, 3, 1, 2, 0, 5, 1, 4])

Then you can use groupby to put the output in the format that you want:

from itertools import groupby

# group the (row, column) pairs by row and collect the columns of each group
for k, g in groupby(zip(i.tolist(), j.tolist()), lambda x: x[0]):
    print(k, [col for _, col in g])
# 0 [2, 4]
# 1 [3, 5]
# 2 [0, 3]
# 3 [1, 2]
# 4 [0, 5]
# 5 [1, 4]
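An alternative that avoids groupby, sketched here under the assumption that a plain list per row is acceptable, is to build the boolean mask once and call np.flatnonzero on each of its rows. The array below holds only the first three rows and six columns of the matrix in the question, to keep the example short; `mask` and `result` are names picked for this sketch.

```python
import numpy as np

# First three rows and six columns of the matrix from the question
a = np.array([
    [0.0, 5.37060126, 2.68530063, 4.65107712, 2.68530063, 4.65107712],
    [5.37060126, 0.0, 4.65107712, 2.68530063, 4.65107712, 2.68530063],
    [2.68530063, 4.65107712, 0.0, 2.68530063, 4.65107712, 5.37060126],
])

mask = (a > 2.3) & (a < 2.7)

# For every row with at least one match: [row, col, col, ...]
result = [
    [row] + np.flatnonzero(mask_row).tolist()
    for row, mask_row in enumerate(mask)
    if mask_row.any()
]

print(result)  # [[0, 2, 4], [1, 3, 5], [2, 0, 3]]
```

This matches the layout requested in the question: the row number followed by the column indices found in that row.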
Saullo G. P. Castro
  • I tried to use your solution, but I always get errors. I was also reading on [how to use groupby](http://stackoverflow.com/questions/773/how-do-i-use-pythons-itertools-groupby) but nothing so far. `for k,g in groupby(zip(i,j), lambda x: x[0]):` `NameError: name 'i' is not defined` – muammar Nov 20 '13 at 19:44
  • @muammar `i, j` are the indices returned by `np.where`... I forgot to put in the answer... now I've edited it and it should work! – Saullo G. P. Castro Nov 20 '13 at 20:16
  • Both `argwhere` and `where` use `nonzero`. – hpaulj Nov 21 '13 at 04:45
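To illustrate the last comment, here is a small check, on made-up data, that np.where called with a single condition gives the same index arrays as np.nonzero, and that np.argwhere gives those same indices as rows of [row, column] pairs. The array `v` is an assumption for this example only.

```python
import numpy as np

# Made-up data for illustration
v = np.array([[0.0, 2.5, 3.0],
              [2.4, 5.0, 2.6]])

mask = (v > 2.3) & (v < 2.7)

# np.nonzero returns one index array per dimension
rows, cols = np.nonzero(mask)

# np.where with only a condition returns the same tuple of index arrays
w_rows, w_cols = np.where(mask)

# np.argwhere returns the same indices, one [row, column] pair per match
pairs = np.argwhere(mask)

print(rows, cols)  # [0 1 1] [1 0 2]
print(np.array_equal(pairs, np.transpose(np.nonzero(mask))))  # True
```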