3

I have a list that looks like this:

a = [[[0.0125, 6.6], [0.0125, 6.65], [0.0125, 6.7], [0.0125, 6.75], [0.0125, 6.8]], [[0.0185, 6.6], [0.0185, 6.65], [0.0185, 6.7], [0.0185, 6.75], [0.0185, 6.8]]]

ie: N sub-lists (only two here) and M sub-sub-lists in each sub-list (five in this example). Each element/sub-sub-list is made of two floats.

I need to find the index of a given element, say [0.0185, 6.75]. In this case the result should be: [1, 3].

I can't just apply the .index() operator on a since the element is inside one of the sub-lists and since I don't know a priori which one it is I can't loop through the sub-lists applying that operator because it will result in an error if the element is not found.


Add

I tried the answers by zhangxaochen ans DSM in a much larger array (16 sub-lists and 70 sub-sub-lists) to see which one was faster and this is what I got:

DSM: 4.31537628174e-05
zhangxaochen: 0.00113296508789

Since DSM's answer its ~26x faster, I'm selecting that one. Thanks guys!

Gabriel
  • 40,504
  • 73
  • 230
  • 404
  • When you write "only two here", does that mean that in reality, there will be more than two? – Tim Pietzcker Jan 21 '14 at 16:15
  • @Gabriel How do you expect the indices to be returned, for the nested lists? – thefourtheye Jan 21 '14 at 16:35
  • @thefourtheye not more _nested_ sub-lists but just more of them (ie: instead of 2 sub-lists, 40; instead of 5 sub-sub-lists, 60) Perhaps I expressed myself poorly, sorry. – Gabriel Jan 21 '14 at 16:48

4 Answers4

2

One way would be to use next and enumerate:

>>> a = [[[0.0125, 6.6], [0.0125, 6.65], [0.0125, 6.7], [0.0125, 6.75], [0.0125, 6.8]], [[0.0185, 6.6], [0.0185, 6.65], [0.0185, 6.7], [0.0185, 6.75], [0.0185, 6.8]]]
>>> search_for = [0.0185, 6.75]
>>> print next(((i,j) for i,x in enumerate(a) for j,y in enumerate(x) 
...             if y == search_for), None)
(1, 3)
>>> search_for = [0.0185, 99]
>>> print next(((i,j) for i,x in enumerate(a) for j,y in enumerate(x) 
...             if y == search_for), None)
None

But since testing equality of floats can be too sensitive, you might want to replace y == search_for with an is_close(y, search_for) function which allows some tolerance for error. Methods using is in or .index can't really handle that.

DSM
  • 342,061
  • 65
  • 592
  • 494
  • Unfortunately, the OP just commented that the lists can be nested more deeply. – Tim Pietzcker Jan 21 '14 at 16:26
  • No, the lists won't be nested more deeply. I said there can be _more_ sub-lists and sub-sub-lists (end even elements inside those sub-sub-lists) but not nested inside each other, just their number can increase. – Gabriel Jan 21 '14 at 16:47
2

I'd like to use numpy to do this:

In [93]: from numpy import *
    ...: a = [[[0.0125, 6.6], [0.0125, 6.65], [0.0125, 6.7], [0.0125, 6.75], [0.0125, 6.8]], [[0.0185, 6.6], [0.0185, 6.65], [0.0185, 6.7], [0.0185, 6.75], [0.0185, 6.8]]]
    ...: a=np.asarray(a)
    ...: needle=[0.0185, 6.75]
    ...: idx=nonzero(all(a==needle, axis=-1))
    ...: asarray(idx)[:,0]
    ...: 
Out[93]: array([1, 3])

I refered to these posts:

Python/NumPy first occurrence of subarray

https://github.com/numpy/numpy/issues/2269

In this way it could handle deeply nested cases, e.g. a=[[[[your data...],[...]]]] is 4 level nested, the expected output index is (0,1,3) now:

In [95]: from numpy import *
    ...: a = [[[[0.0125, 6.6], [0.0125, 6.65], [0.0125, 6.7], [0.0125, 6.75], [0.0125, 6.8]], [[0.0185, 6.6], [0.0185, 6.65], [0.0185, 6.7], [0.0185, 6.75], [0.0185, 6.8]]]]
    ...: a=np.asarray(a)
    ...: needle=[0.0185, 6.75]
    ...: idx=nonzero(all(a==needle, axis=-1))
    ...: asarray(idx)[:,0]
Out[95]: array([0, 1, 3])
Community
  • 1
  • 1
zhangxaochen
  • 32,744
  • 15
  • 77
  • 108
1

Using next and a generator expression:

search = [0.0185, 6.75]

gen = ((ix,iy) for ix,outer in enumerate(a) for iy,inner in enumerate(outer) if inner == search)

next(gen,'not found')
Out[27]: (1, 3)

If the generator is exhausted without finding a result, next returns its second argument ('not found' in this case, use whatever you'd like to use)

If the nested list comp above is confusing to you, it is syntactically equivalent to:

for ix,outer in enumerate(a):
    for iy,inner in enumerate(outer):
        if inner == search:
            yield (ix,iy)
roippi
  • 25,533
  • 4
  • 48
  • 73
0

Test for membership using in before you call .index().

def find(lst, needle):
    for i, sublist in enumerate(lst):
        if needle in sublist:
            return [i, sublist.index(needle)]

a = [[[0.0125, 6.6], [0.0125, 6.65], [0.0125, 6.7], [0.0125, 6.75], [0.0125, 6.8]], [[0.0185, 6.6], [0.0185, 6.65], [0.0185, 6.7], [0.0185, 6.75], [0.0185, 6.8]]]
element = [0.0185, 6.75]

print(find(a, element))

Result:

[1, 3]
senshin
  • 10,022
  • 7
  • 46
  • 59
  • `in` and `index`, they both are `O(N)` – thefourtheye Jan 21 '14 at 16:07
  • @thefourtheye That's true... is there a faster way? If the data isn't sorted or otherwise structured, you have to read every element until you find the one you're looking for, don't you? – senshin Jan 21 '14 at 16:09
  • @thefourtheye But they are both built-in methods, which means they iterate over the list much faster than a python loop. However it's true that `senshin` could have just used `sublist.find(needle)` and avoid the extra iteration. Or `sublist.index(needle)` inside a `try` block. – Bakuriu Jan 21 '14 at 16:22