0

I have 2 Python lists of integers. The lists are possibly different sizes. One is a list of indices of all the maxima in a dataset, and the other is a list of indices of all the minima. I want to make a list of consecutive maxes and mins in order, and skipping cases where, say, 2 mins come between 2 maxes.

Speed matters most, so I'm asking how the following can done most quickly (using Numpy, I assume, a la this answer): What numpy code can make up some_function() below to do this calculation?

>>> min_idx = [1,5,7]
>>> max_idx = [2,4,6,8]
>>> some_function(min_idx, max_idx)
[1, 2, 5, 6, 7, 8]

In the above example, we looked to see which *_idx list started with the lower value and chose it to be "first" (min_idx). From there, we hop back and forth between min_idx and max_idx to pic "the next biggest number":

  1. Start with 1 from min_idx
  2. Look at max_idx to find the first unused number which is larger than 1: 2
  3. Go back to min_idx to find the first unused number which is larger than 2: 5
  4. Again for max_idx: we skip 4 because it's less than 5 and chose 6
  5. continue process until we run out of values in either list.

As another example, for min_idx = [1,3,5,7,21] and max_idx = [4,6,8,50], the expected result is [1,4,5,6,7,8,21,50]

My current non-Numpy solution looks like this where idx is the output:

# Ensure we use alternating mins and maxes
idx = []
max_bookmark = 0
if min_idx[0] < max_idx[0]:
    first_idx = min_idx
    second_idx = max_idx
else:
    first_idx = max_idx
    second_idx = min_idx
for i, v in enumerate(first_idx):
    if not idx:
        # We just started, so put our 1st value in idx
        idx.append(v)
    elif v > idx[-1]:
        idx.append(v)
    else:
        # Go on to next value in first_idx until we're bigger than the last (max) value
        continue

    # We just added a value from first_idx, so now look for one from second_idx
    for j, k in enumerate(second_idx[max_bookmark:]):
        if k > v:
            idx.append(k)
            max_bookmark += j + 1
            break

Unlike other answers about merging Numpy arrays, the difficulty here is comparing element values as one hops between the two lists along the way.

Background: Min/Max List

The 2 input lists to my problem above are generated by scipy.argrelextrema which has to be used twice: once to get indices of maxima and again to get indices of minima. I ultimately just want a single list of indices of alternating maxes and mins, so if there's some scipy or numpy function which can find maxes and mins of a dataset, and return a list of indices indicating alternating maxes and mins, that would solve what I'm looking for too.

hamx0r
  • 4,081
  • 1
  • 33
  • 46
  • 1
    It's not clear why `[1, 2, 5, 8, 7, 8]` is supposed to be the output for the example input. `8` repeats twice, among other apparent problems. – user2357112 Jul 05 '18 at 23:22
  • I corrected a typo in my example and explained the examples significance with a bulleted list following the example. – hamx0r Jul 07 '18 at 21:31

1 Answers1

0

Here is a much simpler logic without using Numpy (note: this assumes that max(min_idx) < max(max_idx):

min_idx = [1,3,5,7,21]
max_idx = [4,6,8,50]
res = []

for i in min_idx:
    if not res or i > res[-1]:
        pair = min([m for m in max_idx if m > i])
        res.extend([i, pair])

print(res)
>>> [1, 4, 5, 6, 7, 8, 21, 50]
iacob
  • 20,084
  • 6
  • 92
  • 119
  • This works well so long as gaps between values aren't too large. I'm still curious if there's a numpy-based way which would be faster. Each list will really be ~10 elements long spaced 10 or so apart like `[11, 25, 32, 46, 60, 71, 79, 85, 91]` – hamx0r Jul 07 '18 at 21:34
  • The 2nd line of the look requires a min_idx & max_idx value to be consecutive, and the input data is not like that. For `min_idx = [1,3,5,7,21]` and `max_idx = [4,6,8,50]`, the expected result is `[1,4,5,6,7,8,21,50]`, but your code gives `[3,4,5,6,7,8]`. – hamx0r Jul 09 '18 at 15:56
  • @hamx0r ah, I understand now - would you mind editing your question to include this more robust example? – iacob Jul 09 '18 at 16:16
  • 1
    This approach works, @ukemi, though I'm still curious if there's a straightforward numpy solution since numpy tends to be faster with number crunching than raw python. – hamx0r Jul 11 '18 at 03:14