3

Is there a short way to detect the longest sublist with alternative signs within a list?

For instance:

my_list = [-1, -0.5, 1, -3, 4, 5, 5, -1]

returning 4 starting from -0.5 to 4?

This is what I have written so far but I feel there is room for something much shorter.

import numpy

my_list = [-1, -0.5, 1, -3, 4, 5, 5, -1]

# function that detects whether a list has alternate signs
# https://stackoverflow.com/questions/6451514/detect-alternating-signs
def is_alternating_signs(a):
    return numpy.all(numpy.abs(numpy.diff(numpy.sign(a))) == 2)

# getting all sublists from the main list
sublists = []
for i in range(len(my_list) + 1):
      for j in range(i + 1, len(my_list) + 1):
          sublists.append(my_list[i:j])

# detecting the longest sublist with alternate signs
max_list = 0
for sublist in sublists:
      if is_alternating_signs(sublist) and len(sublist) > max_list:
          max_list = len(sublist)
print(max_list)
norok2
  • 25,683
  • 4
  • 73
  • 99
Nicolas Berthier
  • 459
  • 1
  • 8
  • 17

4 Answers4

4

Use zip to compare the current element with the next one:

maxlen = 1
curlen = 1
for i, j in zip(l, l[1:]):

    # if one conditions match
    # increment curlen by 1
    if (i < 0 and j > 0) or (i > 0 and j < 0):
        curlen += 1

    # break the alternative sign
    # keep the highest value between maxlen and curlen
    # reset curlen to 1
    else:
        maxlen = max(maxlen, curlen)
        curlen = 1
maxlen = max(maxlen, curlen)

Output:

>>> maxlen
4
Corralien
  • 109,409
  • 8
  • 28
  • 52
1

What about a single loop?

def max_alt_subseq_size(seq):
    last_x = seq[0]
    size = max_size = 1
    for x in seq[1:]:
        # use the fact that x * y < 0 iff x > 0 and y < 0 or x < 0 and y > 0
        if last_x * x < 0: 
            size += 1 
        else:
            # once the size of the alternating subsequence is found, we need to check if it is the largest
            if size > max_size: 
                max_size = size
            size = 1 
        last_x = x
    # check on the final subsequence to see if it is the largest
    if size > max_size: 
        max_size = size
    return max_size


my_list = [-1, -0.5, 1, -3, 4, 5, 5, -1]
max_alt_subseq_size(my_list)
# 4
norok2
  • 25,683
  • 4
  • 73
  • 99
1

You can use zip to detect the positions of 'breaks' in the alternance. Then combine these breaks into ranges to find the longest streak of alternating values:

L = [-1, -0.5, 1, -3, 4, 5, 5, -1]

breaks  = [i for i,(a,b) in enumerate(zip(L,L[1:]),1) if (a<0)==(b<0)]
longest = max((L[s:e] for s,e in zip([0]+breaks,breaks+[None])),key=len)

print(longest)
[-0.5, 1, -3, 4]

If you're only looking for the length of the streak, you could convert the zip result to a string of 1s and 0s, then split on 0s and measure the longest substring:

max(map(len,"".join("01"[a*b<0] for a,b in zip(L,L[1:])).split('0')))+1 

4
Alain T.
  • 40,517
  • 4
  • 31
  • 51
  • For any late-comer, keep in mind that this solution is both time and memory inefficient, compared to a single loop. – norok2 Oct 24 '21 at 13:33
1

One could have a (number of) fully vectorized approach.

The code below assumes a NumPy 1D array as input.

For example, if one computes the run-length encoding (RLE) in a vectorized fashion, it would be simple to use RLE information on some array that represents where the signs change to compute the desired value

import numpy as np


def rle(arr):
    n = len(arr)
    if n == 0:
        values = np.empty(0, dtype=arr.dtype)
        lengths = np.empty(0, dtype=np.int_)
    else:
        positions = np.concatenate(
            [[-1], np.nonzero(arr[1:] != arr[:-1])[0], [n - 1]])
        lengths = positions[1:] - positions[:-1]
        values = arr[positions[1:]]
    return values, lengths


def max_alt_rle(arr):
    values, lengths = rle(arr[1:] * arr[:-1] < 0)
    subs_lengths = lengths[values]
    return (1 if len(arr) > 0 else 0) + \
        (np.max(subs_lengths) if len(subs_lengths) > 0 else 0)

Alternatively, one could make good use of the richer functionalities available to Strings/Bytes, notably str.split() to craft a very short, vectorized, but not very efficient solution:

def max_alt_np(arr):
    return (1 if len(arr) > 0 else 0) + \
        len(max((arr[1:] * arr[:-1] < 0).tobytes().split(b'\x00')))

If one is after raw speed, accelerating with Numba the single loop solution would be most efficient and fast solution:

import numba as nb


@nb.jit
def max_alt_nb(arr):
    if len(arr):
        last_x = arr[0]
        size = max_size = 1
        for x in arr[1:]:
            if last_x * x < 0: 
                size += 1 
            else:
                if size > max_size: 
                    max_size = size
                size = 1 
            last_x = x
        if size > max_size: 
            max_size = size
        return max_size
    else:
        return 0

Finally, here is reported an adaptation of the currently accepted answer, which is neither efficient nor fast, but it is relatively compact (but not as compact as max_alt_np and considerably slower) and can use lists without prior conversion to a NumPy array:

def max_alt_str(arr):
    return (1 if len(arr) > 0 else 0) + len(max(
        ("".join(
            "01"[1 if a * b < 0 else 0]
            for a, b in zip(arr[:-1], arr[1:])))
        .split("0")))

Here some benchmarks on random integer arrays of varying size:

bm_full bm_zoom

(Full analysis here).

norok2
  • 25,683
  • 4
  • 73
  • 99