Slicing a list into sublists based on condition

Question

I want to slice this list of numbers:

num_list = [97, 122, 99, 98, 111, 112, 113, 100, 102]

into multiple sublists. The condition for slicing is that the numbers in each sublist should be in increasing order.

So the final result will look like this:

 list_1 = [97, 122]
 list_2 = [99]
 list_3 = [98, 111, 112, 113]
 list_4 = [100, 102]

Can anyone help me to solve this problem please? Thanks a lot

score 9 · Accepted Answer · answered Sep 28 '18 at 08:44

I've quickly written one way to do this. I'm sure there are more efficient ways, but this works:

num_list = [97, 122, 99, 98, 111, 112, 113, 100, 102]

arrays = [[num_list[0]]]  # list of sublists (starts with the first value)
for i in range(1, len(num_list)):  # go through each element after the first
    if num_list[i - 1] < num_list[i]:  # if it's larger than the previous one
        arrays[-1].append(num_list[i])  # add it to the last sublist
    else:  # otherwise
        arrays.append([num_list[i]])  # start a new sublist
print(arrays)

Hopefully this helps you a bit :)

sir do you know how to do this with strings instead of numbers because this won't help with numbers — Dinindu Theekshana, Sep 23 '21 at 19:51
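Regarding the comment above about strings: Python's comparison operators also work on strings (lexicographic order by code point), so the same loop applies unchanged. A minimal sketch that wraps the loop in a function; the name `split_increasing`, the empty-input guard, and the sample words are made up for illustration:

```python
def split_increasing(items):
    """Split items into runs where each element is strictly larger than the previous one."""
    if not items:  # guard: the original loop assumes at least one element
        return []
    runs = [[items[0]]]
    for prev, cur in zip(items, items[1:]):
        if prev < cur:
            runs[-1].append(cur)   # still increasing: extend the current run
        else:
            runs.append([cur])     # order broken: start a new run
    return runs

words = ["apple", "pear", "fig", "kiwi", "lime", "date"]
print(split_increasing(words))
# [['apple', 'pear'], ['fig', 'kiwi', 'lime'], ['date']]
```

Note that string comparison is case-sensitive, and uppercase letters sort before lowercase ones.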

jpp · Answer 2 · 2018-09-28T14:22:41.933

Creating a variable number of variables is not recommended. Use a list of lists or dictionary instead. Here's an example with dict and a generator function:

from itertools import islice, zip_longest

def yield_lists(L):
    x = []
    for i, j in zip_longest(L, islice(L, 1, None), fillvalue=L[-1]):
        x.append(i)
        if i > j:
            yield x
            x = []
    yield x

num_list = [97, 122, 99, 98, 111, 112, 113, 100, 102]

res = dict(enumerate(yield_lists(num_list), 1))

Result:

{1: [97, 122],
 2: [99],
 3: [98, 111, 112, 113],
 4: [100, 102]}

For example, access the second list via res[2].
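One detail worth noting: the generator above starts a new list only when `i > j`, so equal neighbours stay in the same sublist, whereas the accepted answer's `<` test splits them. The sketch below is a variant, not the answer's code: it compares against the previous item instead of using `zip_longest`, accepts an empty input, and takes a hypothetical `strict` flag to choose between the two behaviours.

```python
def yield_runs(iterable, strict=False):
    """Yield consecutive runs; strict=True also breaks on equal neighbours."""
    run = []
    for item in iterable:
        if run:
            prev = run[-1]
            # A run ends on a drop, or (when strict) on a repeated value.
            if item < prev or (strict and item == prev):
                yield run
                run = []
        run.append(item)
    if run:  # nothing is yielded for an empty input
        yield run

data = [1, 1, 2, 0]
print(list(yield_runs(data)))               # [[1, 1, 2], [0]]
print(list(yield_runs(data, strict=True)))  # [[1], [1, 2], [0]]
```

It can be combined with `dict(enumerate(..., 1))` in the same way as `yield_lists`.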

Ideal solution, generators for the win – Take_Care_ Sep 28 '18 at 09:01

Mazdak · score 6 · Answer 3 · 2018-09-28T09:52:29.067

Here is a one-liner Numpythonic approach:

np.split(arr, np.where(np.diff(arr) < 0)[0] + 1)

Or a pure-Python approach along the same lines, though less efficient:

from operator import sub
from itertools import starmap

# Differences between neighbours; a negative value marks the start of a new sublist
diffs = starmap(sub, zip(num_list[1:], num_list))
indices = [0] + [i + 1 for i, j in enumerate(diffs) if j < 0] + [len(num_list)]

result = [num_list[i:j] for i, j in zip(indices, indices[1:])]

Demo:

# Numpy
In [8]: np.split(num_list, np.where(np.diff(num_list) < 0)[0] + 1)
Out[8]: 
[array([ 97, 122]),
 array([99]),
 array([ 98, 111, 112, 113]),
 array([100, 102])]

# Python
In [42]: from operator import sub

In [43]: from itertools import starmap

In [44]: indices = [0] + [i+1 for i, j in enumerate(list(starmap(sub, zip(num_list[1:], num_list)))) if j < 0] + [len(num_list)]

In [45]: [num_list[i:j] for i, j in zip(indices, indices[1:])]
Out[45]: [[97, 122], [99], [98, 111, 112, 113], [100, 102]]

Explanation:

np.diff() returns the difference between each item and the one before it, so the result has one element fewer than the input (there is no diff for the last element). You can then use the vectorized nature of numpy to find the indices where this difference is negative, with a simple comparison and np.where(). Finally, pass those indices (plus one) to np.split() to split the array at those positions.
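The intermediate values can be inspected one step at a time. A small sketch, assuming NumPy is installed and imported as `np`; the variable names are only for illustration:

```python
import numpy as np

num_list = [97, 122, 99, 98, 111, 112, 113, 100, 102]
arr = np.asarray(num_list)

diffs = np.diff(arr)             # each element minus its predecessor: n - 1 values
drops = np.where(diffs < 0)[0]   # np.where returns a tuple; [0] is the index array
cut_points = drops + 1           # a drop at diff index k means a new sublist starts at k + 1
parts = np.split(arr, cut_points)

print(diffs.tolist())               # [25, -23, -1, 13, 1, 1, -13, 2]
print(cut_points.tolist())          # [2, 3, 7]
print([p.tolist() for p in parts])  # [[97, 122], [99], [98, 111, 112, 113], [100, 102]]
```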

edited Sep 28 '18 at 09:52

answered Sep 28 '18 at 08:57


Out of curiosity, what would be the last element of the `np.diff(num_list)` array? Because the last element has no next item. – Guimoute Sep 28 '18 at 09:02
@Guimoute There's no diff for last element. Just added that to the answer. – Mazdak Sep 28 '18 at 09:03
I like this approach, it's very appealing! But, could you please elaborate on the [0] + 1 part? I can't quite figure out how that part works. – storluffarn Feb 12 '20 at 10:53
@storluffarn `np.where` returns a tuple with one index array per dimension of the array you pass in (here, the result of `np.diff(num_list) < 0`, which is one-dimensional), so `[0]` picks out that single index array. The `+1` is because of how `split` works: it cuts before the given index, so we add 1 to split right after the positions where the diff condition is True. – Mazdak Feb 12 '20 at 11:06
Much appreciated! – storluffarn Feb 13 '20 at 13:14

score 2 · Answer 4 · answered Sep 28 '18 at 09:28

All nice solutions here. Maybe this one will be easier to understand for some people?

def increasing(a, b):
    return a < b

def seq_split(lst, cond):
    if not lst:  # nothing to yield for an empty list
        return
    sublst = [lst[0]]
    for item in lst[1:]:
        if cond(sublst[-1], item):
            sublst.append(item)
        else:
            yield sublst
            sublst = [item]
    yield sublst  # the last run is always non-empty here

list(seq_split(num_list, increasing))
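Because the condition is passed in as a function, the same generator can split on other rules as well. A sketch, with `seq_split` repeated (including an assumed guard for empty input) so the block runs on its own; `consecutive` is a hypothetical condition, not part of the answer:

```python
from operator import le

def seq_split(lst, cond):
    if not lst:  # assumed guard: yield nothing for an empty list
        return
    sublst = [lst[0]]
    for item in lst[1:]:
        if cond(sublst[-1], item):
            sublst.append(item)
        else:
            yield sublst
            sublst = [item]
    yield sublst

def consecutive(a, b):
    """Hypothetical condition: b directly follows a."""
    return b == a + 1

# Non-decreasing runs: equal neighbours stay together.
print(list(seq_split([1, 2, 2, 1], le)))                  # [[1, 2, 2], [1]]
# Runs of consecutive integers.
print(list(seq_split([1, 2, 3, 7, 8, 10], consecutive)))  # [[1, 2, 3], [7, 8], [10]]
```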
