3

I have list/array of integers, call a subarray a peak if it goes up and then goes down. For example:

[5,5,4,5,4]

contains

[4,5,4]

which is a peak.

Also consider

[6,5,4,4,4,4,4,5,6,7,7,7,7,7,6]

which contains

[6,7,7,7,7,7,6]

which is a peak.

The problem

Given an input list, I would like to find all the peaks contained in it of minimal length and report them. In the example above, [5,6,7,7,7,7,7,6] is also a peak but we remove the first element and it remains a peak so we don't report it.

So for input list:

L = [5,5,5,5,4,5,4,5,6,7,8,8,8,8,8,9,9,8]

we would return

[4,5,4] and [8,9,9,8] only.

I am having problems devising a nice algorithm for this. Any help would be hugely appreciated.

Community
  • 1
  • 1
Simd
  • 19,447
  • 42
  • 136
  • 271

1 Answers1

4

Using itertools

Here is a short solution using itertools.groupby to detect peaks. The groups identifying peaks are then unpacked to yield the actual sequence.

from itertools import groupby, islice

l = [1, 2, 1, 2, 2, 0, 0]

fst, mid, nxt = groupby(l), islice(groupby(l), 1, None), islice(groupby(l), 2, None)
peaks = [[f[0], *m[1], n[0]] for f, m, n in zip(fst, mid, nxt) if f[0] < m[0] > n[0]]

print(peaks)

Output

[[1, 2, 1], [1, 2, 2, 0]]

Using a loop (faster)

The above solution is elegant but since three instances of groupby are created, the list is traversed three times.

Here is a solution using a single traversal.

def peaks(lst):
    first = 0
    last = 1
    while last < len(lst) - 1:
        if lst[first] < lst[last] == lst[last+1]:
            last += 1
        elif lst[first] < lst[last] > lst[last+1]:
            yield lst[first:last+2]
            first = last + 1
            last += 2
        else:
            first = last
            last += 1

l = [1, 2, 1, 2, 2, 0, 0]
print(list(peaks(l)))

Output

[[1, 2, 1], [1, 2, 2, 0]]

Notes on benchmark

Upon benchmarking with timeit, I noticed an increase in performance of about 20% for the solution using a loop. For short lists the overhead of groupby could bring that number up to 40%. The benchmark was done on Python 3.6.

Olivier Melançon
  • 21,584
  • 4
  • 41
  • 73
  • The function seems to be missing a return statement? – Simd Feb 13 '19 at 16:39
  • 1
    @Anush It's a generator, it [yields](https://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do) the peaks. This is a good practice for when lists can be fairly long. And anyway, a generator can always be made into a list with a cast. – Olivier Melançon Feb 13 '19 at 16:50
  • Thank you. Your answers are great. – Simd Feb 14 '19 at 06:05