79

I have a list containing data as such:

[1, 2, 3, 4, 7, 8, 10, 11, 12, 13, 14]

I'd like to print out the ranges of consecutive integers:

1-4, 7-8, 10-14

Is there a built-in/fast/efficient way of doing this?

James
  • See http://stackoverflow.com/questions/2154249/identify-groups-of-continuous-numbers-in-a-list, which points you to http://docs.python.org/library/itertools.html#examples – Dominic Rodger Mar 02 '10 at 09:14
  • 1
    Homework? You show us what you've tried and we'll see if we can do better. – John Machin Mar 02 '10 at 09:16
  • Thank you, I should have found that before asking :/ – James Mar 02 '10 at 09:18
  • no problem, it wasn't that easy to find - I just happen to remember seeing it. Your question isn't an exact duplicate, since your desired output is a bit different. – Dominic Rodger Mar 02 '10 at 09:20

8 Answers

123

From the docs:

>>> from itertools import groupby
>>> from operator import itemgetter
>>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
>>> for k, g in groupby(enumerate(data), lambda (i, x): i-x):
...     print map(itemgetter(1), g)
...
[1]
[4, 5, 6]
[10]
[15, 16, 17, 18]
[22]
[25, 26, 27, 28]

You can adapt this fairly easily to get a printed set of ranges.
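One way that adaptation might look in Python 3 is sketched below. The helper name `format_ranges` is made up for illustration and is not part of the docs recipe. The trick is that inside a run of consecutive integers the value grows in step with its index, so `index - value` stays constant and `groupby` can use it as the grouping key. The sketch assumes the input is sorted and has no duplicates.

```python
from itertools import groupby


def format_ranges(data):
    """Render runs of consecutive integers in sorted, duplicate-free data."""
    parts = []
    # index - value is the same for every member of a consecutive run
    for _, group in groupby(enumerate(data), lambda ix: ix[0] - ix[1]):
        run = [x for _, x in group]
        parts.append(f"{run[0]}-{run[-1]}")
    return ", ".join(parts)


print(format_ranges([1, 2, 3, 4, 7, 8, 10, 11, 12, 13, 14]))
# prints: 1-4, 7-8, 10-14
```

A value that stands alone comes out as `n-n` with this sketch.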

ShadowRanger
Dominic Rodger
  • Don't forget to `import itertools`. Also, this only works with Python 2.4 and higher. – Gabe Mar 02 '10 at 09:48
  • 1
    actually you'll need `from itertools import *` and `from operator import *` (or equivalent), at least in Python 2.6. – Andre Holzner Apr 11 '11 at 11:12
  • 40
    Don't use star imports! **Never** use star imports! Use `from itertools import groupby` and `from operator import itemgetter` instead. – Danilo Bargen Aug 28 '13 at 20:41
  • 41
    Change the lambda to `lambda ix : ix[0] - ix[1]` and it works in both Python 3 and Python 2 (well, not counting the print statement). – Kevin May 20 '15 at 04:17
  • 18
    I was about to upvote this answer because of how clever it is. Unfortunately, it's _too_ clever for me to upvote without it having an explanation of what the code is doing/how it works. – mgilson Nov 24 '15 at 16:52
  • If you have duplicates in your input list (e.g. [0, 1, 1, 2]), this approach won't cluster the full sequence; deduplicate the list first, e.g. with sorted(set(my_original_list)), before passing it to this function to get a streak of 0, 1, 2 returned – duhaime Feb 09 '17 at 19:23
  • 14
    For all those that are trying the code for Python 3, read @Kevin 's comment. Also, the print statement won't work because you actually need to use `list()`, as you can see here https://stackoverflow.com/questions/7731213/print-doesnt-print-when-its-in-map-python basically you should use `print(list(map(itemgetter(1), g)))` in Python 3 – Euler_Salter Mar 20 '18 at 09:49
  • @dominic-rodger, how could you pass the index along from the groupby(), so that you get the list of values but another list of the corresponding indexes? I feel this is possible since you are enumerating(), but can't make it happen. – ratchet Jan 17 '19 at 21:34
  • For Python 3: `for k, g in groupby(enumerate(n_matline), lambda ix: ix[0] - ix[1]): print(list(map(itemgetter(1), g)))` – Tamil Selvan S Nov 28 '22 at 12:18
28

A short solution that works without additional imports. It accepts any iterable, sorts unsorted inputs, and removes duplicate items:

def ranges(nums):
    nums = sorted(set(nums))
    gaps = [[s, e] for s, e in zip(nums, nums[1:]) if s+1 < e]
    edges = iter(nums[:1] + sum(gaps, []) + nums[-1:])
    return list(zip(edges, edges))

Example:

>>> ranges([2, 3, 4, 7, 8, 9, 15])
[(2, 4), (7, 9), (15, 15)]

>>> ranges([-1, 0, 1, 2, 3, 12, 13, 15, 100])
[(-1, 3), (12, 13), (15, 15), (100, 100)]

>>> ranges(range(100))
[(0, 99)]

>>> ranges([0])
[(0, 0)]

>>> ranges([])
[]

This is the same as @dansalmo's solution which I found amazing, albeit a bit hard to read and apply (as it's not given as a function).

Note that it could easily be modified to return "traditional" half-open ranges [start, end), e.g. by altering the return statement:

    return [(s, e+1) for s, e in zip(edges, edges)]
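
To get the comma-separated string from the question, the tuples returned by `ranges` could be joined as in the sketch below. `ranges_to_str` is a hypothetical helper name, and collapsing single values to `n` instead of `n-n` is just one possible choice.

```python
def ranges(nums):
    # Same function as in the answer above, repeated so the sketch runs alone.
    nums = sorted(set(nums))
    gaps = [[s, e] for s, e in zip(nums, nums[1:]) if s + 1 < e]
    edges = iter(nums[:1] + sum(gaps, []) + nums[-1:])
    return list(zip(edges, edges))


def ranges_to_str(nums):
    # Hypothetical helper: render each (start, end) pair, collapsing singletons.
    return ", ".join(
        str(s) if s == e else f"{s}-{e}" for s, e in ranges(nums)
    )


print(ranges_to_str([1, 2, 3, 4, 7, 8, 10, 11, 12, 13, 14]))
# prints: 1-4, 7-8, 10-14
```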
coldfix
11

This will print exactly as you specified:

>>> nums = [1, 2, 3, 4, 7, 8, 10, 11, 12, 13, 14]
>>> ranges = sum((list(t) for t in zip(nums, nums[1:]) if t[0]+1 != t[1]), [])
>>> iranges = iter(nums[0:1] + ranges + nums[-1:])
>>> print(', '.join([str(n) + '-' + str(next(iranges)) for n in iranges]))
1-4, 7-8, 10-14

If the list has any single number ranges, they would be shown as n-n:

>>> nums = [1, 2, 3, 4, 5, 7, 8, 9, 12, 15, 16, 17, 18]
>>> ranges = sum((list(t) for t in zip(nums, nums[1:]) if t[0]+1 != t[1]), [])
>>> iranges = iter(nums[0:1] + ranges + nums[-1:])
>>> print(', '.join([str(n) + '-' + str(next(iranges)) for n in iranges]))
1-5, 7-9, 12-12, 15-18
dansalmo
3

Built-In: No, as far as I'm aware.

You have to run through the array. Start off with putting the first value in a variable and print it, then as long as you keep hitting the next number do nothing but remember the last number in another variable. If the next number is not in line, check the last number remembered versus the first number. If it's the same, do nothing. If it's different, print "-" and the last number. Then put the current value in the first variable and start over. At the end of the array you run the same routine as if you had hit a number out of line.

I could have written the code, of course, but I don't want to spoil your homework :-)
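
For readers who do want to see it, the procedure described above might look roughly like this. It is only a sketch: the function name `summarize_runs` is invented here, the input is assumed to be sorted and duplicate-free, and the pieces are collected into a string rather than printed one at a time.

```python
def summarize_runs(nums):
    """Walk a sorted list once, tracking the first and last value of each run."""
    if not nums:
        return ""
    parts = []
    first = last = nums[0]
    for n in nums[1:]:
        if n == last + 1:
            # Still in line: just remember the newest value.
            last = n
            continue
        # Out of line: close the current run, then start over at n.
        parts.append(str(first) if first == last else f"{first}-{last}")
        first = last = n
    # End of the list: close the final run the same way.
    parts.append(str(first) if first == last else f"{first}-{last}")
    return ", ".join(parts)


print(summarize_runs([1, 2, 3, 4, 7, 8, 10, 11, 12, 13, 14]))
# prints: 1-4, 7-8, 10-14
```

As in the description, a value with no neighbours is emitted on its own, without a dash.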

TToni
0

I had a similar problem and use the following for a sorted list. It returns a dictionary that maps each run of consecutive numbers to its [first, last] values. Each key is the number of gaps (breaks in the sequence) seen before that run started, so the keys are not necessarily contiguous, and an isolated value that belongs to no run is left out of the result.

Your list gives me an output of {0: [1, 4], 1: [7, 8], 2: [10, 14]}

def series_dictf(index_list):
    from collections import defaultdict    
    series_dict = defaultdict(list)
    sequence_dict = dict()

    list_len = len(index_list)
    series_interrupts = 0    

    # Compare each value with its successor, stopping before the last element.
    for i in range(list_len - 1):
        position_a = index_list[i]
        position_b = index_list[i + 1]

        if position_b == (position_a + 1):
            # Consecutive pair: both values belong to the current run.
            sequence_dict[position_a] = series_interrupts
            sequence_dict[position_b] = series_interrupts
        else:
            # Gap found: later values belong to a new run.
            series_interrupts += 1

    for position, series in sequence_dict.items():
        series_dict[series].append(position)
    for series, position in series_dict.items():
        series_dict[series] = [position[0], position[-1]]

    return series_dict
0

Using set operations, this can be done as follows:

def get_consecutive_integer_series(integer_list):
    integer_list = sorted(integer_list)
    start_item = integer_list[0]
    end_item = integer_list[-1]

    a = set(integer_list)  # Set a
    b = range(start_item, end_item+1)

    # Pick items that are not in range.
    c = set(b) - a  # Set operation b-a

    li = []
    start = 0
    for i in sorted(c):
        end = b.index(i)  # Get end point of the list slicing
        li.append(b[start:end])  # Slice list using values
        start = end + 1  # Increment the start point for next slicing
    li.append(b[start:])  # Add the last series

    for sliced_list in li:
        if not sliced_list:
            # list is empty
            continue
        if len(sliced_list) == 1:
            # If only one item found in list
            yield sliced_list[0]
        else:
            yield "{0}-{1}".format(sliced_list[0], sliced_list[-1])


a = [1, 2, 3, 6, 7, 8, 4, 14, 15, 21]
for series in get_consecutive_integer_series(a):
    print(series)

Output for the above list "a"
1-4
6-8
14-15
21

theBuzzyCoder
-1

Here is another basic solution that doesn't use any modules, which makes it suitable for interviews, where you are generally asked to solve this without imports:

#!/usr/bin/python

def split_list(n):
    """Return the split points: one past each value that is followed by a gap."""
    return [(x+1) for x,y in zip(n, n[1:]) if y-x != 1]

def get_sub_list(my_list):
    """Split the sorted list into runs of consecutive values at the split points."""
    my_index = split_list(my_list)
    output = list()
    prev = 0
    for index in my_index:
        new_list = [ x for x in my_list[prev:] if x < index]
        output.append(new_list)
        prev += len(new_list)
    output.append([ x for x in my_list[prev:]])
    return output

my_list = [1, 3, 4, 7, 8, 10, 11, 13, 14]
print(get_sub_list(my_list))

Output:

[[1], [3, 4], [7, 8], [10, 11], [13, 14]]
James Sapam
-2

You can use the collections library, which has a class called Counter. Counter comes in handy when you want to tally the distinct elements of any iterable:

from collections import Counter
data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
cnt=Counter(data)
print(cnt)

The output looks like this:

Counter({1: 1, 4: 1, 5: 1, 6: 1, 10: 1, 15: 1, 16: 1, 17: 1, 18: 1, 22: 1, 25: 1, 26: 1, 27: 1, 28: 1})

which, like any other dictionary, can be queried by key.