8

The aim is to find groups of increasing/monotonic numbers given a list of integers. Each item in the resulting group must be of a +1 increment from the previous item

Given an input:

x = [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]

I need to find groups of increasing numbers and achieve:

increasing_numbers = [(7,8,9,10), (0,1,2,3,4,5)]

And eventually also the number of increasing numbers:

len(list(chain(*increasing_numbers)))

And also the len of the groups:

increasing_num_groups_length = [len(i) for i in increasing_numbers]

I have tried the following to get the number of increasing numbers:

>>> from itertools import tee, chain
>>> def pairwise(iterable): 
...     a, b = tee(iterable)
...     next(b, None)
...     return zip(a, b)
... 
>>> x = [8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6]
>>> set(list(chain(*[(i,j) for i,j in pairwise(x) if j-1==i])))
set([1, 2, 3, 4, 5, 6, 8, 9, 10, 11])
>>> len(set(list(chain(*[(i,j) for i,j in pairwise(x) if j-1==i]))))
10

But I'm unable to keep the order and the groups of increasing numbers.

How can I achieve the increasing_numbers groups of integer tuples and also the increasing_num_groups_length?

Also, is there a name for such/similar problem?


EDITED

I've came up with this solution but it's super verbose and I'm sure there's an easier way to achieve the increasing_numbers output:

>>> from itertools import tee, chain
>>> def pairwise(iterable): 
...     a, b = tee(iterable)
...     next(b, None)
...     return zip(a, b)
... 
>>> x = [8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6]
>>> boundary =  iter([0] + [i+1 for i, (j,k) in enumerate(pairwise(x)) if j+1!=k] + [len(x)])
>>> [tuple(x[i:next(boundary)]) for i in boundary]
[(8, 9, 10, 11), (1, 2, 3, 4, 5, 6)]

Is there a more pythonic / less verbose way to do this?


Another input/output example:

[in]:

[17, 17, 19, 20, 21, 22, 0, 1, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 14, 14, 28, 29, 30, 31, 32, 33, 34, 35, 36, 40]

[out]:

[(19, 20, 21, 22), (0, 1, 2), (4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), (28, 29, 30, 31, 32, 33, 34, 35, 36)]

alvas
  • 115,346
  • 109
  • 446
  • 738
  • It would call your problem: finding intervals on which data is monotonous and increasing. It is actually as simple as finding the places where the data is **not** increasing, and using it as the boundaries for your groups (assuming that the data is never decreasing, but that seemed to be the case from both your description and your input). – Bertrand Caron Oct 28 '15 at 22:20
  • My solution looks like some monster perl script =( Is there someone who knows a pythonic solution to this? BTW, this is not homework, just trying to get part of an algorithm right. And it requires this "monotonic" or increasing sequence counting. – alvas Oct 28 '15 at 22:36
  • When pythonic becomes cryptic, I usually opt for a simple for loop. – RobertB Oct 28 '15 at 22:39
  • @alvas. Your output seems to be wrong, since the second sequence begins `0, 1, 2, 2, ...`. – ekhumoro Oct 29 '15 at 16:17
  • @ekhumoro, whoops, human blindness. Yep, it should start at `4...` instead. – alvas Oct 29 '15 at 16:20
  • 1
    @alvas. This is starting to become pretty funny: now you missed `0, 1, 2` as the second sequence. Maybe you should use one of the answers to check your question is right ;-) – ekhumoro Oct 29 '15 at 16:25
  • Thank you all for the interesting solutions, if it's alright, i've given the bounty to the @ekhumoro answer and for completion pandaric's answer seems to be best =) – alvas Nov 06 '15 at 14:12

9 Answers9

7

EDIT:

Here's a code-golf solution (142 characters):

def f(x):s=[0]+[i for i in range(1,len(x)) if x[i]!=x[i-1]+1]+[len(x)];return [x[j:k] for j,k in [s[i:i+2] for i in range(len(s)-1)] if k-j>1]

Expanded version:

def igroups(x):
    s = [0] + [i for i in range(1, len(x)) if x[i] != x[i-1] + 1] + [len(x)]
    return [x[j:k] for j, k in [s[i:i+2] for i in range(len(s)-1)] if k - j > 1]

Commented version:

def igroups(x):
    # find the boundaries where numbers are not consecutive
    boundaries = [i for i in range(1, len(x)) if x[i] != x[i-1] + 1]
    # add the start and end boundaries
    boundaries = [0] + boundaries + [len(x)]
    # take the boundaries as pairwise slices
    slices = [boundaries[i:i + 2] for i in range(len(boundaries) - 1)]
    # extract all sequences with length greater than one
    return [x[start:end] for start, end in slices if end - start > 1]

Original solution:

Not sure whether this counts as "pythonic" or "not too verbose":

def igroups(iterable):
    items = iter(iterable)
    a, b = None, next(items, None)
    result = [b]
    while b is not None:
        a, b = b, next(items, None)
        if b is not None and a + 1 == b:
            result.append(b)
        else:
            if len(result) > 1:
                yield tuple(result)
            result = [b]

print(list(igroups([])))
print(list(igroups([0, 0, 0])))
print(list(igroups([7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5])))
print(list(igroups([8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6])))
print(list(igroups([9, 1, 2, 3, 1, 1, 2, 3, 5])))

Output:

[]
[]
[(7, 8, 9, 10), (0, 1, 2, 3, 4, 5)]
[(8, 9, 10, 11), (1, 2, 3, 4, 5, 6)]
[(1, 2, 3), (1, 2, 3)]
ekhumoro
  • 115,249
  • 20
  • 229
  • 336
  • Thanks for the answer, it's looks more "readable" for sure. But maybe there's a simpler way. It's not as simple as the people who downvoted this question thought, right? – alvas Oct 29 '15 at 07:56
  • @ekhumoro: Last output, ultimate 5 is missing, isn't it? – P. Ortiz Oct 29 '15 at 10:51
  • 1
    @P.Ortiz. No: I'm assuming from the OPs own solution (`if j+1!=k`) that they only want sequences with no gaps (i.e. always increasing by one). If that's not the case, though, the test can just be changed to `a < b`. – ekhumoro Oct 29 '15 at 15:38
  • @alvas. I'm not exactly sure why your question received so many downvotes. Maybe because the requirements not very well defined, and you don't say what *specific* problems you are having in trying to find a solution. And what *exactly* would count as a "simpler way"? I hope you don't just mean "shorter", because that would be off-topic for SO. If you want shorter, ask on [codegolf](https://codegolf.stackexchange.com/). – ekhumoro Oct 29 '15 at 15:50
  • Surely not shorter. Simply as in easier to understand and hopefully more pythonic, possibly similar to what you have done. The problem is simply to group increasing numbers in a list =) – alvas Oct 29 '15 at 15:53
  • @alvas. Is the problem so simple? Look carefully at the tests in my answer: do they all give the results you expect? When I try them with your own solution, the second gives `[(0,), (0,)]` and the last gives `[(9,), (1,), (5,)]`. To be more specific: should the sequences **only** increase by one (i.e. no gaps), and does a one-item sequence count? – ekhumoro Oct 29 '15 at 16:05
  • The sequence should only increase by +1 – alvas Oct 29 '15 at 16:12
  • @ekhumoro: you are right, I didn't realize that integers has to be *consecutive*. – P. Ortiz Oct 29 '15 at 17:11
  • You can get that down to 135 bytes `def f(x):q=len(x);s=[0]+[i for i in range(1,q)if x[i]!=x[i-1]+1]+[q];return[x[j:k]for j,k in[s[i:i+2]for i in range(len(s)-1)]if k-j>1]` – Morgan Thrapp Nov 05 '15 at 15:56
4

A couple of different ways using itertools and numpy:

from itertools import groupby, tee, cycle

x = [17, 17, 19, 20, 21, 22, 0, 1, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 14, 14, 28, 29, 30, 31, 32, 33, 34, 35,
     36, 1, 2, 3, 4,34,54]


def sequences(l):
    x2 = cycle(l)
    next(x2)
    grps = groupby(l, key=lambda j: j + 1 == next(x2))
    for k, v in grps:
        if k:
            yield tuple(v) + (next((next(grps)[1])),)


print(list(sequences(x)))

[(19, 20, 21, 22), (0, 1, 2), (4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), (28, 29, 30, 31, 32, 33, 34, 35, 36), (1, 2, 3, 4)]

Or using python3 and yield from:

def sequences(l):
    x2 = cycle(l)
    next(x2)
    grps = groupby(l, key=lambda j: j + 1 == next(x2))
    yield from (tuple(v) + (next((next(grps)[1])),) for k,v in grps if k)

print(list(sequences(x)))

Using a variation of my answer here with numpy.split :

out = [tuple(arr) for arr in np.split(x, np.where(np.diff(x) != 1)[0] + 1) if arr.size > 1]

print(out)

[(19, 20, 21, 22), (0, 1, 2), (4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), (28, 29, 30, 31, 32, 33, 34, 35, 36), (1, 2, 3, 4)]

And similar to ekhumoro's answer:

def sequences(x):
    it = iter(x)
    prev, temp = next(it), []
    while prev is not None:
        start = next(it, None)
        if prev + 1 == start:
            temp.append(prev)
        elif temp:
            yield tuple(temp + [prev])
            temp = []
        prev = start

To get the length and the tuple:

def sequences(l):
    x2 = cycle(l)
    next(x2)
    grps = groupby(l, key=lambda j: j + 1 == next(x2))
    for k, v in grps:
        if k:
            t = tuple(v) + (next(next(grps)[1]),)
            yield t, len(t)


def sequences(l):
    x2 = cycle(l)
    next(x2)
    grps = groupby(l, lambda j: j + 1 == next(x2))
    yield from ((t, len(t)) for t in (tuple(v) + (next(next(grps)[1]),)
                                      for k, v in grps if k))



def sequences(x):
        it = iter(x)
        prev, temp = next(it), []
        while prev is not None:
            start = next(it, None)
            if prev + 1 == start:
                temp.append(prev)
            elif temp:
                yield tuple(temp + [prev]), len(temp) + 1
                temp = []
            prev = start

Output will be the same for all three:

[((19, 20, 21, 22), 4), ((0, 1, 2), 3), ((4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), 11)
, ((28, 29, 30, 31, 32, 33, 34, 35, 36), 9), ((1, 2, 3, 4), 4)]
Community
  • 1
  • 1
Padraic Cunningham
  • 176,452
  • 29
  • 245
  • 321
2

I think the most maintainable solution would be to make it simple:

def group_by(l):
    res = [[l[0]]]
    for i in range(1, len(l)):
        if l[i-1] < l[i]:
            res[-1].append(l[i])
        else:
            res.append([l[i]])
    return res

This solution does not filter out single element sequences, but it can be easily implemented. Additionally, this has O(n) complexity. And you can make it an generator as well if you want.

By maintainable I mean code that is not an one-liner of 300 characters, with some convoluted expressions. Then maybe you would want to use Perl :). At least you will how the function behaves one year later.

>>> x = [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]
>>> print(group_by(x))
[[7, 8, 9, 10], [6], [0, 1, 2, 3, 4, 5]]
crlb
  • 1,282
  • 12
  • 18
1
def igroups(L):
    R=[[]]
    [R[-1].append(L[i]) for i in range(len(L)) if (L[i-1]+1==L[i] if L[i-1]+1==L[i] else R.append([L[i]]))]
    return [P for P in R if len(P)>1]


tests=[[],
    [0, 0, 0],
    [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5],
    [8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6],
    [9, 1, 2, 3, 1, 1, 2, 3, 5],
    [4,3,2,1,1,2,3,3,4,3],
    [1, 4, 3],
    [1],
    [1,2],
    [2,1]
    ]
for L in tests:
    print(L)
    print(igroups(L))
    print("-"*10)

outputting the following:

[]
[]
----------
[0, 0, 0]
[]
----------
[7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]
[[7, 8, 9, 10], [0, 1, 2, 3, 4, 5]]
----------
[8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6]
[[8, 9, 10, 11], [1, 2, 3, 4, 5, 6]]
----------
[9, 1, 2, 3, 1, 1, 2, 3, 5]
[[1, 2, 3], [1, 2, 3]]
----------
[4, 3, 2, 1, 1, 2, 3, 3, 4, 3]
[[1, 2, 3], [3, 4]]
----------
[1, 4, 3]
[]
----------
[1]
[]
----------
[1, 2]
[[1, 2]]
----------
[2, 1]
[]
----------

EDIT My first attemp using itertools.groupby was a fail, sorry for that.

P. Ortiz
  • 174
  • 7
  • 1
    Lolz, understandable but scarily cryptic. Care to explain the answer a little? ;P – alvas Oct 28 '15 at 23:35
  • 1
    As explained by the Python library docs, `groupby` generates a new group every time the value of the lambda function changes (from `True` meaning that the sequence is increasing to `False` meaning the sequence halts increasing). – P. Ortiz Oct 28 '15 at 23:41
  • 4
    @P.Ortiz. Your current code looks like it is broken because it outputs: `[[7, 8, 9], [10, 6], [0, 1, 2, 3, 4]]`. – ekhumoro Oct 28 '15 at 23:50
  • My error and apologies; I was on the wrong question entirely. – Prune Oct 29 '15 at 23:27
1

If two consecutive numbers are increasing by one I form a list (group) of tuples of those numbers.

When non-increasing and if the list (group) is non-empty, I unpack it and zip again to rebuild the pair of sequence which were broken by the zip. I use set comprehension to eliminate duplicate numbers.

  def extract_increasing_groups(seq):
    seq = tuple(seq)

    def is_increasing(a,b):
        return a + 1 == b

    def unzip(seq):
        return tuple(sorted({ y for x in zip(*seq) for y in x}))

    group = []
    for a,b in zip(seq[:-1],seq[1:]):
        if is_increasing(a,b):
            group.append((a,b))
        elif group:
            yield unzip(group)
            group = []

    if group:
        yield unzip(group)

if __name__ == '__main__':

    x = [17, 17, 19, 20, 21, 22, 0, 1, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12,

         13, 14, 14, 14, 28, 29, 30, 31, 32, 33, 34, 35, 36, 40]

    for group in extract_increasing_groups(x):
        print(group)

Simpler one using set;

from collections import namedtuple
from itertools import islice, tee

def extract_increasing_groups(iterable):

    iter1, iter2 = tee(iterable)
    iter2 = islice(iter2,1,None)

    is_increasing = lambda a,b: a + 1 == b
    Igroup = namedtuple('Igroup','group, len')

    group = set()
    for pair in zip(iter1, iter2):
        if is_increasing(*pair):
            group.update(pair)
        elif group:
            yield Igroup(tuple(sorted(group)),len(group))
            group = set()

    if group:
        yield Igroup(tuple(sorted(group)), len(group))


if __name__ == '__main__':

    x = [17, 17, 19, 20, 21, 22, 0, 1, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 14, 14, 28, 29, 30, 31, 32, 33, 34, 35, 36, 40]
    total = 0
    for group in extract_increasing_groups(x):
        total += group.len
        print('Group: {}\nLength: {}'.format(group.group, group.len))
    print('Total: {}'.format(total))
Nizam Mohamed
  • 8,751
  • 24
  • 32
0

With itertools.groupby, the problem of partionning a list of integers L in sublists of adjacent and increasing consecutive items from L can be done with a one-liner. Nevertheless I don't know how pythonic it can be considered ;)

Here is the code with some simple tests:

[EDIT : now subsequences are increasing by 1, I missed this point the first time.]

from itertools import groupby

def f(i):
    return  L[i-1]+1==L[i]


def igroups(L):
    return [[L[I[0]-1]]+[L[i] for i in I] for I in [I for (v,I) in [(k,[i for i in list(g)]) for (k, g) in groupby(range(1, len(L)), f)] if v]]

outputting:

tests=[
    [0, 0, 0, 0],
    [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5],
    [8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6],
    [9, 1, 2, 3, 1, 1, 2, 3, 5],
    [4,3,2,1,1,2,3,3,4,3],
    [1, 4, 3],
    [1],
    [1,2, 2],
    [2,1],
    [0, 0, 0, 0, 2, 5, 5, 8],
    ]
for L in tests:
    print(L)
    print(igroups(L))
    print('-'*10)


[0, 0, 0, 0]
[]
----------
[7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]
[[7, 8, 9, 10], [0, 1, 2, 3, 4, 5]]
----------
[8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6]
[[8, 9, 10, 11], [1, 2, 3, 4, 5, 6]]
----------
[9, 1, 2, 3, 1, 1, 2, 3, 5]
[[1, 2, 3], [1, 2, 3]]
----------
[4, 3, 2, 1, 1, 2, 3, 3, 4, 3]
[[1, 2, 3], [3, 4]]
----------
[1, 4, 3]
[]
----------
[1]
[]
----------
[1, 2, 2]
[[1, 2]]
----------
[2, 1]
[]
----------
[0, 0, 0, 0, 2, 5, 5, 8]
[]
----------

Some explanation. If you "unroll" the code, the logic is more apparant :

from itertools import groupby

def f(i):
    return L[i]==L[i-1]+1

def igroups(L):
    monotonic_states = [(k,list(g)) for (k, g) in groupby(range(1, len(L)), f)]
    increasing_items_indices = [I for (v,I) in monotonic_states if v]
    print("\nincreasing_items_indices ->", increasing_items_indices, '\n')
    full_increasing_items= [[L[I[0]-1]]+[L[i] for i in I] for I in increasing_items_indices]
    return full_increasing_items

L= [2, 8, 4, 5, 6, 7, 8, 5, 9, 10, 11, 12, 25, 26, 27, 42, 41]
print(L)
print(igroups(L))

outputting :

[2, 8, 4, 5, 6, 7, 8, 5, 9, 10, 11, 12, 25, 26, 27, 42, 41]

increasing_items_indices -> [[3, 4, 5, 6], [9, 10, 11], [13, 14]]

[[4, 5, 6, 7, 8], [9, 10, 11, 12], [25, 26, 27]]

We need a key function f that compares an item with the preceding one in the given list. Now, the important point is that the groupby function with the key function f provides a tuple (k, S) where S represents adjacent indices from the initial list and where the state of f is constant, the state being given by the value of k: if k is True, then S represents increasing (by 1) items indices else non-increasing items indices. (in fact, as the example above shows, the list S is incomplete and lacks the first item).

I also made some random tests with one million items lists : igroups function returns always the correct response but is 4 times slower than a naive implementation! Simpler is easier and faster ;)

Thanks alvas for your question, it gives me a lot of fun!

P. Ortiz
  • 174
  • 7
  • Great answer! Thanks for unrolling the list comprehensions and explaning it. – alvas Oct 29 '15 at 15:58
  • Is there a reason to use `L[i]>L[i-1]` instead of `L[i] == L[i]+1`? – alvas Oct 29 '15 at 16:10
  • @alvas: I suppose you are talking about the definition of the key function, don't you? But how `L[i] == L[i]+1` is possible ? – P. Ortiz Oct 29 '15 at 16:14
  • Ah i see what you're doing, you're also keeping the increasing but not +1 sequences!! Kool!!! – alvas Oct 29 '15 at 16:17
  • @P.Ortiz. The OP has said in other comments that the sequences should only increase by one. So perhaps they meant `L[i] == L[i-1]+1`? (PS: what is a "naive" implementation?) – ekhumoro Oct 29 '15 at 16:22
  • @P.Ortiz, One issue though, groupby seem to do some sorting which sorts the order of the sequences. – alvas Oct 29 '15 at 16:25
  • @alvas. I don't think so. BTW, at first time, I missed you need **consecutive** increasing sub-sequences. I update my codes. – P. Ortiz Oct 29 '15 at 17:46
  • @ ekhumoro By naive implementation I mean the basic one you write if you are not aware of itertools and even iterator, the one a beginner is going to write. You need only a `for` loop, `if`/`else` conditions, list creation and the `append` method to populate the list with consecutive items. More or less, the implementation you will write with a low level language like C. – P. Ortiz Oct 29 '15 at 18:03
  • @P.Ortiz. I always thought "naive" meant a crude solution that missed most of the edge cases. Or to put it another way: it's the sort of solution most people get if they don't do any testing :-) – ekhumoro Oct 29 '15 at 18:19
0

A (really) simple implementation:

x = [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]
result = []
current = x[0]
temp = []
for i in xrange(1, len(x)):
    if (x[i] - current == 1):
        temp.append( x[i] )
    else:
         if (len(temp) > 1):
             result.append(temp)
         temp = [ x[i] ]
    current = x[i]
result.append(temp)

And you will get [ [7, 8, 9, 10], [0, 1, 2, 3, 4, 5] ]. From there, you can get the number of increasing numbers by [ len(x) for x in result ] and the total number of numbers sum( len(x) for x in result).

Woody1193
  • 7,252
  • 5
  • 40
  • 90
0

I think this works. It's not fancy but it's simple. It constructs a start list sl and an end list el, which should always be the same length, then uses them to index into x:

def igroups(x):
    sl = [i for i in range(len(x)-1)
          if (x == 0 or x[i] != x[i-1]+1) and x[i+1] == x[i]+1]

    el = [i for i in range(1, len(x))
          if x[i] == x[i-1]+1 and (i == len(x)-1 or x[i+1] != x[i]+1)]

    return [x[sl[i]:el[i]+1] for i in range(len(sl))]
Tom Karzes
  • 22,815
  • 2
  • 22
  • 41
0

Late answer, but a simple implementation, generalized to take a predicate so that it need not necessarily be increasing numbers, but could readily be any relationship between the two numbers.

def group_by(lst, predicate):
    result = [[lst[0]]]
    for i, x in enumerate(lst[1:], start=1):
        if not predicate(lst[i - 1], x):
            result.append([x])
        else:
            result[-1].append(x)
    return list(filter(lambda lst: len(lst) > 1, result))

Testing this:

>>> group_by([1,2,3,4, 7, 1, 0, 2], lambda x, y: x < y)
[[1, 2, 3, 4, 7], [0, 2]]
>>> group_by([1,2,3,4, 7, 1, 0, 2], lambda x, y: x > y)
[[7, 1, 0]]
>>> group_by([1,2,3,4, 7, 1, 0, 0, 2], lambda x, y: x < y)
[[1, 2, 3, 4, 7], [0, 2]]
>>> group_by([1,2,3,4, 7, 1, 0, 0, 2], lambda x, y: x > y)
[[7, 1, 0]]
>>> group_by([1,2,3,4, 7, 1, 0, 0, 2], lambda x, y: x >= y)
[[7, 1, 0, 0]]
>>>

And now we can easily specialize this:

>>> def ascending_groups(lst):
...     return group_by(lst, lambda x, y: x < y)
... 
>>> ascending_groups([1,2,3,4, 7, 1, 0, 0, 2])
[[1, 2, 3, 4, 7], [0, 2]]
>>> 
Chris
  • 26,361
  • 5
  • 21
  • 42