12

I have a bunch of numbers, say the following:

1 2 3 4  6 7 8  20 24 28 32

The information presented there could be represented in Python as ranges:

[range(1, 5), range(6, 9), range(20, 33, 4)]

In my output I'd write 1..4, 6..8, 20..32..4, but that is just a matter of presentation.

Another answer shows how one can do this for contiguous ranges. I don't see how I can easily do this for strided ranges like above. Is there a similar trick for this?

Community
  • 1
  • 1
Martin Ueding
  • 8,245
  • 6
  • 46
  • 92

4 Answers4

4

Here's a straight forward approach at the problem.

def get_ranges(ls):
    N = len(ls)
    while ls:
        # single element remains, yield the trivial range
        if N == 1:
            yield range(ls[0], ls[0] + 1)
            break

        diff = ls[1] - ls[0]
        # find the last index that satisfies the determined difference
        i = next(i for i in range(1, N) if i + 1 == N or ls[i+1] - ls[i] != diff)

        yield range(ls[0], ls[i] + 1, diff)

        # update variables
        ls = ls[i+1:]
        N -= i + 1
Jared Goguen
  • 8,772
  • 2
  • 18
  • 36
  • get_ranges([1,2,4,5,7,9] gives a range of [7, 9] at the end. – George Sep 19 '17 at 16:55
  • @George what would you expect? The above algorithm will produce [1,2], [4,5], [7,9] as intended because it is greedily filling ranges. If you want an non-greedy algorithm, an entirely different approach would be necessary and there is nothing in this question to suggest that it is. – Jared Goguen Sep 19 '17 at 18:47
  • Ah shoot, I misunderstood the question. Never mind :) – George Sep 19 '17 at 22:25
4
def ranges(data):
    result = []
    if not data:
        return result
    idata = iter(data)
    first = prev = next(idata)
    for following in idata:
        if following - prev == 1:
            prev = following
        else:
            result.append((first, prev + 1))
            first = prev = following
    # There was either exactly 1 element and the loop never ran,
    # or the loop just normally ended and we need to account
    # for the last remaining range.
    result.append((first, prev+1))
    return result

Test:

>>> data = range(1, 5) + range(6, 9) + range(20, 24)
>>> print ranges(data)
[(1, 5), (6, 9), (20, 24)]
9000
  • 39,899
  • 9
  • 66
  • 104
  • @JaredGoguen: As an exercise, add a parameter that sets the step. Auto-detection of the step needs preliminary full scan. – 9000 May 04 '17 at 22:17
  • I don't believe that's true. See the other answers, including mine. – Jared Goguen May 05 '17 at 03:54
  • @JaredGoguen: Your solution assumes that input data start with a sequence longer that 1 element, so the first diff is the step. Consider input `[10, 15, 5, 6, 7, 30, 31, 32]`. The step can be determined e.g. as the smallest diff between elements (requires a full scan), but it leads to problems when ranges touch: `[1, 2, 3, 3, 4, 5]`, or even intersect: `[10, 11, 12, 11, 12, 13]`. Step auto-detection, as any pattern-recognition problem, is not trivial and requires a few assumptions to be clearly stated. – 9000 May 05 '17 at 15:01
1

You can use groupby and count from itertools module along with Counter from collections module like this example:

Update: See the comments in order to understand the logic behind this solution and its limitations.

from itertools import groupby, count
from collections import Counter

def ranges_list(data=list, func=range, min_condition=1):
    # Sort in place the ranges list
    data.sort()

    # Find all the steps between the ranges's elements
    steps = [v-k for k,v in zip(data, data[1:])]

    # Find the repeated items's steps based on condition. 
    # Default: repeated more than once (min_condition = 1)
    repeated = [item for item, count in Counter(steps).items() if count > min_condition]

    # Group the items in to a dict based on the repeated steps
    groups = {k:[list(v) for _,v in groupby(data, lambda n, c = count(step = k): n-next(c))] for k in repeated}

    # Create a dict:
    # - keys are the steps
    # - values are the grouped elements
    sub = {k:[j for j in v if len(j) > 1] for k,v in groups.items()}

    # Those two lines are for pretty printing purpose:
    # They are meant to have a sorted output.
    # You can replace them by:
    # return [func(j[0], j[-1]+1,k) for k,v in sub.items() for j in v]
    # Otherwise:
    final = [(j[0], j[-1]+1,k) for k,v in sub.items() for j in v]
    return [func(*k) for k in sorted(final, key = lambda x: x[0])]

ranges1 = [1, 2, 3, 4, 6, 7, 8, 20, 24, 28, 32]
ranges2 = [1, 2, 3, 4, 6, 7, 10, 20, 24, 28, 50,51,59,60]

print(ranges_list(ranges1))
print(ranges_list(ranges2))

Output:

[range(1, 5), range(6, 9), range(20, 33, 4)]
[range(1, 5), range(6, 8), range(20, 29, 4), range(50, 52), range(59, 61)]

Limitations:

With this kind of intput:

ranges3 = [1,3,6,10]
print(ranges_list(ranges3)
print(ranges_list(ranges3, min_condition=0))

Will output:

# Steps are repeated <= 1 with the condition: min_condition = 1
# Will output an empty list
[]
# With min_condition = 0
# Will output the ranges using: zip(data, data[1:])
[range(1, 4, 2), range(3, 7, 3), range(6, 11, 4)]

Feel free to use this solution and adopt it or modify it in order to fill your needs.

Chiheb Nexus
  • 9,104
  • 4
  • 30
  • 43
  • Shouldn't the second sequence produce a `range(10, 21, 10)`? – Jared Goguen May 04 '17 at 21:17
  • Yes, when i set the condition `min_confirmation = 0` , it will output: `[range(1, 5), range(4, 7, 2), range(6, 8), range(7, 11, 3), range(10, 21, 10), range(20, 29, 4), range(28, 51, 22), range(50, 52), range(51, 60, 8), range(59, 61)]` so `range(10, 21, 10)` is included. This is listed in the third sequence under the limitation which i assume that will produce the non desired output. I'm still waiting the OP comment to keep the code like this or modify it. – Chiheb Nexus May 04 '17 at 21:23
0

It might not be super short or elegant, but it seems to work:

def ranges(ls):
    li = iter(ls)
    first = next(li)
    while True:
        try:
            element = next(li)
        except StopIteration:
            yield range(first, first+1)
            return
        step = element - first
        last = element
        while True:
            try:
                element = next(li)
            except StopIteration:
                yield range(first, last+step, step)
                return
            if element - last != step:
                yield range(first, last+step, step)
                first = element
                break
            last = element

This iterates over an iterator of the list, and yields range objects:

>>> list(ranges([1, 2, 3, 4, 6, 7, 8, 20, 24, 28, 32]))
[range(1, 5), range(6, 9), range(20, 33, 4)]

It also handles negative ranges, and ranges that have just one element:

>>> list(ranges([9,8,7, 1,3,5, 99])
[range(9, 6, -1), range(1, 7, 2), range(99, 100)]
L3viathan
  • 26,748
  • 2
  • 58
  • 81