50

I have a long list of float numbers ranging from 1 to 5, called "average", and I want to return the list of indices of the elements that are smaller than a or larger than b.

def find(lst,a,b):
    result = []
    for x in lst:
        if x<a or x>b:
            i = lst.index(x)
            result.append(i)
    return result

matches = find(average,2,4)

But surprisingly, the output for "matches" has a lot of repetitions in it, e.g. [2, 2, 10, 2, 2, 2, 19, 2, 10, 2, 2, 42, 2, 2, 10, 2, 2, 2, 10, 2, 2, ...].

Why is this happening?

Henrik Andersson
Logan Yang
  • Possible duplicate of [How to find all occurrences of an element in a list?](https://stackoverflow.com/questions/6294179/how-to-find-all-occurrences-of-an-element-in-a-list) – Qiu Jul 10 '17 at 11:33

3 Answers

81

You are using .index() which will only find the first occurrence of your value in the list. So if you have a value 1.0 at index 2, and at index 9, then .index(1.0) will always return 2, no matter how many times 1.0 occurs in the list.
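
A minimal sketch of that behavior, using a small made-up list rather than the asker's data (the values here are assumptions chosen so that 1.0 appears twice):

```python
# Hypothetical sample data: the value 1.0 appears at indices 2 and 4.
average = [3.0, 2.5, 1.0, 3.5, 1.0]

first = average.index(1.0)   # searches from the start of the list
print(first)                 # 2, even though 1.0 also sits at index 4

# Calling index() inside the loop therefore repeats the first position.
repeated = [average.index(x) for x in average if x < 2 or x > 4]
print(repeated)              # [2, 2]
```

Each .index() call also rescans the list from the beginning, so the original loop does far more work than it needs to on a long list.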

Use enumerate() to add indices to your loop instead:

def find(lst, a, b):
    result = []
    for i, x in enumerate(lst):
        if x<a or x>b:
            result.append(i)
    return result

You can collapse this into a list comprehension:

def find(lst, a, b):
    return [i for i, x in enumerate(lst) if x<a or x>b]
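
As a quick check, here is the comprehension version run on a small made-up list with a repeated value (the data is an assumption for illustration only):

```python
def find(lst, a, b):
    """Return the indices of elements smaller than a or larger than b."""
    return [i for i, x in enumerate(lst) if x < a or x > b]

# Hypothetical sample data: 1.0 appears twice, at indices 2 and 4.
average = [3.0, 2.5, 1.0, 3.5, 1.0, 4.5]
matches = find(average, 2, 4)
print(matches)  # [2, 4, 5]
```

Both occurrences of 1.0 now report their own index instead of the first one twice.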
Martijn Pieters
  • Now I totally get it. The list comprehension is really a good one, I'm still trying to adapt to this kind of compact form in Python. Your answer is excellent, thanks so much! – Logan Yang May 22 '13 at 07:17
  • What's funny is that the wrong result with repetitions seems to work fine for my later use, since I want to use it to extract columns of a large matrix. It seems repetitions do not affect the slicing. – Logan Yang May 22 '13 at 07:22
  • You'll still get the correct values out of your list, since the same value resides at index 2 and at the later indices. But it's a bug waiting to happen, biting you at some other point in your code. – Martijn Pieters May 22 '13 at 07:24
3

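If you're doing a lot of this kind of thing, you should consider using numpy. The session below selects the values strictly between a and b, not the values outside that range that the question asks for.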

In [56]: import random, numpy

In [57]: lst = numpy.array([random.uniform(0, 5) for _ in range(1000)]) # example list

In [58]: a, b = 1, 3

In [59]: numpy.flatnonzero((lst > a) & (lst < b))[:10]
Out[59]: array([ 0, 12, 13, 15, 18, 19, 23, 24, 26, 29])
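
For the condition in the question itself (smaller than a or larger than b), the mask would presumably use element-wise OR instead. A small sketch on made-up data, with a = 2 and b = 4 taken from the question's call:

```python
import numpy

# Hypothetical sample data standing in for the asker's "average" list.
average = numpy.array([3.0, 2.5, 1.0, 3.5, 1.0, 4.5])
a, b = 2, 4

# | is element-wise OR; the parentheses are required because
# | binds more tightly than the comparison operators.
matches = numpy.flatnonzero((average < a) | (average > b))
print(matches.tolist())  # [2, 4, 5]
```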

In response to Seanny123's question, I used this timing code:

import numpy, timeit, random

a, b = 1, 3

lst = numpy.array([random.uniform(0, 5) for _ in range(1000)])

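def numpy_way():
    # Vectorized mask, then the first 10 matching indices.
    numpy.flatnonzero((lst > 1) & (lst < 3))[:10]

def list_comprehension():
    # Python-level loop over the array; collects values rather than indices.
    [e for e in lst if 1 < e < 3][:10]

# timeit.timeit() runs each function 1,000,000 times by default.
print(timeit.timeit(numpy_way))
print(timeit.timeit(list_comprehension))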

On my machine, the numpy version is over 60 times faster.

mirekphd
Alex Coventry
  • What's the performance comparison compared to just doing a list comprehension? Also, why use `numpy.flatnonzero` over `numpy.where`? – Seanny123 Mar 09 '17 at 11:28
  • It's over 60 times faster in my hands. `flatnonzero` is simpler than `where`, here; you don't need to pull the array of indices out of the tuple. – Alex Coventry Mar 10 '17 at 02:06
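
To illustrate the `flatnonzero` versus `where` point from the comments, here is a short sketch on made-up data (the array values are assumptions): with a single condition argument, `numpy.where` returns a tuple of index arrays, while `numpy.flatnonzero` returns the index array directly.

```python
import numpy

lst = numpy.array([0.5, 2.0, 4.5, 1.5, 2.5])  # hypothetical data
mask = (lst > 1) & (lst < 3)

via_where = numpy.where(mask)        # a tuple with one index array per dimension
via_flat = numpy.flatnonzero(mask)   # the index array itself

print(type(via_where).__name__)      # tuple
print(via_where[0].tolist())         # [1, 3, 4]
print(via_flat.tolist())             # [1, 3, 4]
```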
-1
>>> average =  [1,3,2,1,1,0,24,23,7,2,727,2,7,68,7,83,2]
>>> matches = [i for i in range(0,len(average)) if average[i]<2 or average[i]>4]
>>> matches
[0, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14, 15]
Sheng