Size of the largest subsequence of values with consecutive alternative sign?

Question

Is there a short way to detect the longest sublist with alternative signs within a list?

For instance:

my_list = [-1, -0.5, 1, -3, 4, 5, 5, -1]

returning 4 starting from -0.5 to 4?

This is what I have written so far but I feel there is room for something much shorter.

import numpy

my_list = [-1, -0.5, 1, -3, 4, 5, 5, -1]

# function that detects whether a list has alternate signs
# https://stackoverflow.com/questions/6451514/detect-alternating-signs
def is_alternating_signs(a):
    return numpy.all(numpy.abs(numpy.diff(numpy.sign(a))) == 2)

# getting all sublists from the main list
sublists = []
for i in range(len(my_list) + 1):
      for j in range(i + 1, len(my_list) + 1):
          sublists.append(my_list[i:j])

# detecting the longest sublist with alternate signs
max_list = 0
for sublist in sublists:
      if is_alternating_signs(sublist) and len(sublist) > max_list:
          max_list = len(sublist)
print(max_list)

So, is your expected output only the length or also say the starting index/value? — norok2, Oct 22 '21 at 13:45
I think, instead of trying all sublists, just iterate from the beginning of the list and keep track of how long it is alternating. This should make it at least faster. — T C Molenaar, Oct 22 '21 at 13:46

Corralien · Answer 1 · 2021-10-22T15:16:06.640

4

Use zip to compare the current element with the next one:

maxlen = 1
curlen = 1
for i, j in zip(l, l[1:]):

    # if one conditions match
    # increment curlen by 1
    if (i < 0 and j > 0) or (i > 0 and j < 0):
        curlen += 1

    # break the alternative sign
    # keep the highest value between maxlen and curlen
    # reset curlen to 1
    else:
        maxlen = max(maxlen, curlen)
        curlen = 1
maxlen = max(maxlen, curlen)

Output:

>>> maxlen
4

edited Oct 22 '21 at 15:16

answered Oct 22 '21 at 13:52

Corralien

109,409
8
28
52

1

If `0` were considered positive or did not occur, one could get away with `if (i < 0) != (j < 0)`. – user2390182 Oct 22 '21 at 14:02
Question - what the output from this `lst = [-1, -5, 1, -3, 4, 5, 5, -1, 9, -4, 5, -6, 7]` It not clear to me from this post. Please confirm. Thanks. – Daniel Hao Oct 22 '21 at 14:33
1

@DanielHao. I fixed the bug. I have to compare maxlen and curlen at the end of the loop too. Thanks. The answer is now 7 – Corralien Oct 22 '21 at 15:16
@user2390182. I let the OP clarify this point. – Corralien Oct 22 '21 at 15:18
OP here. The answer would be 7. – Nicolas Berthier Oct 24 '21 at 07:35
@NicolasBerthier. Does my answer solve your problem? – Corralien Oct 24 '21 at 07:36
Yes, thanks. I will use @alain-t solution going forward as it is slightly shorter. – Nicolas Berthier Oct 24 '21 at 10:56
Ok no problem but this is not optimal because you have to execute 2 loops instead one. – Corralien Oct 24 '21 at 11:08

norok2 · Answer 2 · 2021-10-24T16:19:02.733

What about a single loop?

def max_alt_subseq_size(seq):
    last_x = seq[0]
    size = max_size = 1
    for x in seq[1:]:
        # use the fact that x * y < 0 iff x > 0 and y < 0 or x < 0 and y > 0
        if last_x * x < 0: 
            size += 1 
        else:
            # once the size of the alternating subsequence is found, we need to check if it is the largest
            if size > max_size: 
                max_size = size
            size = 1 
        last_x = x
    # check on the final subsequence to see if it is the largest
    if size > max_size: 
        max_size = size
    return max_size


my_list = [-1, -0.5, 1, -3, 4, 5, 5, -1]
max_alt_subseq_size(my_list)
# 4

Alain T. · Accepted Answer · 2021-10-22T18:23:44.863

1

You can use zip to detect the positions of 'breaks' in the alternance. Then combine these breaks into ranges to find the longest streak of alternating values:

L = [-1, -0.5, 1, -3, 4, 5, 5, -1]

breaks  = [i for i,(a,b) in enumerate(zip(L,L[1:]),1) if (a<0)==(b<0)]
longest = max((L[s:e] for s,e in zip([0]+breaks,breaks+[None])),key=len)

print(longest)
[-0.5, 1, -3, 4]

If you're only looking for the length of the streak, you could convert the zip result to a string of 1s and 0s, then split on 0s and measure the longest substring:

max(map(len,"".join("01"[a*b<0] for a,b in zip(L,L[1:])).split('0')))+1 

4

edited Oct 22 '21 at 18:23

answered Oct 22 '21 at 16:48

Alain T.

40,517
4
31
51

For any late-comer, keep in mind that this solution is both time and memory inefficient, compared to a single loop. – norok2 Oct 24 '21 at 13:33

norok2 · Answer 4 · 2021-10-25T06:45:17.460

One could have a (number of) fully vectorized approach.

The code below assumes a NumPy 1D array as input.

For example, if one computes the run-length encoding (RLE) in a vectorized fashion, it would be simple to use RLE information on some array that represents where the signs change to compute the desired value

import numpy as np


def rle(arr):
    n = len(arr)
    if n == 0:
        values = np.empty(0, dtype=arr.dtype)
        lengths = np.empty(0, dtype=np.int_)
    else:
        positions = np.concatenate(
            [[-1], np.nonzero(arr[1:] != arr[:-1])[0], [n - 1]])
        lengths = positions[1:] - positions[:-1]
        values = arr[positions[1:]]
    return values, lengths


def max_alt_rle(arr):
    values, lengths = rle(arr[1:] * arr[:-1] < 0)
    subs_lengths = lengths[values]
    return (1 if len(arr) > 0 else 0) + \
        (np.max(subs_lengths) if len(subs_lengths) > 0 else 0)

Alternatively, one could make good use of the richer functionalities available to Strings/Bytes, notably str.split() to craft a very short, vectorized, but not very efficient solution:

def max_alt_np(arr):
    return (1 if len(arr) > 0 else 0) + \
        len(max((arr[1:] * arr[:-1] < 0).tobytes().split(b'\x00')))

If one is after raw speed, accelerating with Numba the single loop solution would be most efficient and fast solution:

import numba as nb


@nb.jit
def max_alt_nb(arr):
    if len(arr):
        last_x = arr[0]
        size = max_size = 1
        for x in arr[1:]:
            if last_x * x < 0: 
                size += 1 
            else:
                if size > max_size: 
                    max_size = size
                size = 1 
            last_x = x
        if size > max_size: 
            max_size = size
        return max_size
    else:
        return 0

Finally, here is reported an adaptation of the currently accepted answer, which is neither efficient nor fast, but it is relatively compact (but not as compact as max_alt_np and considerably slower) and can use lists without prior conversion to a NumPy array:

def max_alt_str(arr):
    return (1 if len(arr) > 0 else 0) + len(max(
        ("".join(
            "01"[1 if a * b < 0 else 0]
            for a, b in zip(arr[:-1], arr[1:])))
        .split("0")))

Here some benchmarks on random integer arrays of varying size:

(Full analysis here).

Size of the largest subsequence of values with consecutive alternative sign?

4 Answers4