I posted an earlier question, but it was oversimplified and was rightly flagged as a duplicate, so I'm posting the problem again in more detail in the hope of getting it resolved. Briefly: I have two lists, a = [10.0, 20.0, 25.0, 40.0] and b = [1.0, 10.0, 15.0, 20.0, 30.0, 100.0].

Using a list comprehension, I want to exclude from b the ranges given by consecutive pairs of elements in a. That is: remove from b all elements between 10.0 and 20.0, and between 25.0 and 40.0. Here is what I tried:

kk = 0
while kk < len(a):
    up_lim = a[kk] #upper limit
    dwn_lim = a[kk+1] #lower limit
    x = [b[y] for y in range(len(b)) if (b[y]<dwn_lim or b[y]>up_lim)] #This line produces correct result if done outside of a while loop. Somehow fails in while loop.
    b = list(x) #update the old list with the new&reduced list
    kk += 2 #update counter

I'm expecting the result x = [1.0,100.0], but I get x = [1.0,10.0,15.0,20.0,30.0,100.0]

In fact, the key line with the list comprehension works if I do it outside the while loop (of course this is useless because list 'a' could be arbitrary in size which is why I used a while loop).

So the question is: how and why does a while loop stop the list comprehension from happening correctly?

cs95
Edwin
  • Really, you should look at numpy or pandas for this. – cs95 Jan 15 '18 at 05:39
  • I did find a flaw in my code example. up_lim and dwn_lim should be swapped and this should work. I've accepted the answer below however. Thanks, lads. – Edwin Jan 15 '18 at 06:06
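The swap mentioned in the comment above can be sketched as follows. This is only an illustration of why the original loop removed nothing, not a recommended final form: with the bounds reversed, the filter reads "x < 20.0 or x > 10.0" on the first pass, which is true for every number, so the while loop itself was never the culprit. The names low and high are mine, not the asker's.

```python
a = [10.0, 20.0, 25.0, 40.0]
b = [1.0, 10.0, 15.0, 20.0, 30.0, 100.0]

kk = 0
while kk < len(a):
    low = a[kk]       # lower bound of the range to drop
    high = a[kk + 1]  # upper bound of the range to drop
    # The original test was (x < a[kk+1] or x > a[kk]), e.g. x < 20.0 or x > 10.0,
    # which holds for every x. Keeping x only when it lies outside [low, high] fixes it.
    b = [x for x in b if x < low or x > high]
    kk += 2  # move on to the next (low, high) pair

print(b)  # [1.0, 100.0]
```

This assumes a has an even number of elements, so that a[kk + 1] always exists.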

1 Answer

With vanilla Python, you can generalise using any or all. I'm going with any here.

>>> [x for x in b if not any(i <= x <= j for i, j in zip(a[::2], a[1::2]))]
[1.0, 100.0]

This uses zip to pair each even-indexed item of a (a lower bound) with the odd-indexed item that follows it (the upper bound), then keeps x only if it does not fall inside any of those closed ranges.
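To make the pairing concrete, here is a small sketch that spells the one-liner out in steps. The helper name in_any_interval is hypothetical and used only for illustration.

```python
a = [10.0, 20.0, 25.0, 40.0]
b = [1.0, 10.0, 15.0, 20.0, 30.0, 100.0]

# a[::2] holds the lower bounds, a[1::2] the upper bounds.
intervals = list(zip(a[::2], a[1::2]))
print(intervals)  # [(10.0, 20.0), (25.0, 40.0)]

def in_any_interval(x, intervals):
    """Return True if x lies inside at least one closed interval [lo, hi]."""
    return any(lo <= x <= hi for lo, hi in intervals)

kept = [x for x in b if not in_any_interval(x, intervals)]
print(kept)  # [1.0, 100.0]
```

Note that if a had an odd number of elements, zip would silently drop the unpaired last bound.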

If you're interested in performance, consider a pandas approach. You can build an IntervalIndex, which is well suited to the task. Searching is logarithmic, and very fast.

>>> import pandas as pd
>>> idx = pd.IntervalIndex.from_arrays(a[::2], a[1::2], closed='both')
>>> [x for x, y in zip(b, idx.get_indexer(b)) if y == -1]
[1.0, 100.0]
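If pulling in pandas is not an option, a logarithmic lookup can also be sketched with the standard library's bisect module. This is a different technique from the IntervalIndex above, shown only as a rough alternative. It assumes a is sorted in ascending order, has an even length, and describes non-overlapping closed intervals. The function name outside_intervals is hypothetical.

```python
from bisect import bisect_right

def outside_intervals(values, bounds):
    """Keep the values that lie outside every closed interval in bounds.

    Assumption: bounds is sorted ascending, has even length, and encodes
    non-overlapping intervals as [lo0, hi0, lo1, hi1, ...].
    """
    starts = bounds[::2]
    ends = bounds[1::2]
    kept = []
    for v in values:
        # Index of the last interval whose start is <= v, or -1 if there is none.
        i = bisect_right(starts, v) - 1
        if i < 0 or v > ends[i]:
            kept.append(v)
    return kept

a = [10.0, 20.0, 25.0, 40.0]
b = [1.0, 10.0, 15.0, 20.0, 30.0, 100.0]
print(outside_intervals(b, a))  # [1.0, 100.0]
```

Each lookup is a binary search over the interval starts, so the cost grows with the logarithm of the number of intervals rather than linearly.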
cs95
  • Thanks, this works. I also found a flaw in my code: if I exchange up_lim and dwn_lim, then it should all work out just fine. But I learned something new with your answer. – Edwin Jan 15 '18 at 06:05
  • @Edwin Cheers, I appreciate you accepting the closure of your other question, and the all-round sportsmanship. Please continue to contribute to SO with your questions (and maybe, one day, your answers as well). – cs95 Jan 15 '18 at 06:06