21

I have a list of integers which I need to parse into a string of ranges.

For example:

 [0, 1, 2, 3] -> "0-3"
 [0, 1, 2, 4, 8] -> "0-2,4,8"

And so on.

I'm still learning more pythonic ways of handling lists, and this one is a bit difficult for me. My latest thought was to create a list of lists which keeps track of paired numbers:

[ [0, 3], [4, 4], [5, 9], [20, 20] ]

I could then iterate across this structure, printing each sub-list as either a range, or a single value.

I don't like doing this in two iterations, but I can't seem to keep track of each number within each iteration. My thought would be to do something like this:

Here's my most recent attempt. It works, but I'm not fully satisfied; I keep thinking there's a more elegant solution which completely escapes me. The string-handling iteration isn't the nicest, I know -- it's pretty early in the morning for me :)

def createRangeString(zones):
        rangeIdx = 0
        ranges   = [[zones[0], zones[0]]]
        for zone in list(zones):
            if ranges[rangeIdx][1] in (zone, zone-1):
                ranges[rangeIdx][1] = zone
            else:
                ranges.append([zone, zone])
                rangeIdx += 1

        rangeStr = ""
        for range in ranges:
            if range[0] != range[1]:
                rangeStr = "%s,%d-%d" % (rangeStr, range[0], range[1])
            else:
                rangeStr = "%s,%d" % (rangeStr, range[0])

        return rangeStr[1:]

Is there a straightforward way I can merge this into a single iteration? What else could I do to make it more Pythonic?

bedwyr
  • 5,774
  • 4
  • 31
  • 49
  • See a demonstration of the `more_itertools.consecutive_groups` tool [here](https://stackoverflow.com/a/47642650/4531270). – pylang Dec 04 '17 at 22:03

7 Answers7

22
>>> from itertools import count, groupby
>>> L=[1, 2, 3, 4, 6, 7, 8, 9, 12, 13, 19, 20, 22, 23, 40, 44]
>>> G=(list(x) for _,x in groupby(L, lambda x,c=count(): next(c)-x))
>>> print ",".join("-".join(map(str,(g[0],g[-1])[:len(g)])) for g in G)
1-4,6-9,12-13,19-20,22-23,40,44

The idea here is to pair each element with count(). Then the difference between the value and count() is constant for consecutive values. groupby() does the rest of the work

As Jeff suggests, an alternative to count() is to use enumerate(). This adds some extra cruft that needs to be stripped out in the print statement

G=(list(x) for _,x in groupby(enumerate(L), lambda (i,x):i-x))
print ",".join("-".join(map(str,(g[0][1],g[-1][1])[:len(g)])) for g in G)

Update: for the sample list given here, the version with enumerate runs about 5% slower than the version using count() on my computer

John La Rooy
  • 295,403
  • 53
  • 369
  • 502
  • Try with L = [1,2,3,4,7,9,11,13,14,15]. It will probably break if I understood you correctly. – Muhammad Alkarouri Aug 07 '10 at 11:54
  • 4
    Can I take that back? Brilliant solution, I tip my hat to you. – Muhammad Alkarouri Aug 07 '10 at 12:05
  • @Muhammad, of course :) check out the updated version - it no longer requires `izip` – John La Rooy Aug 07 '10 at 12:32
  • @gnibbler, this looks amazing. I like your use of map in the print statement. I don't think I've come across the next() method. Can you pass any iterable item to it? – bedwyr Aug 07 '10 at 16:15
  • @gnibbler: Impressive. Can you use `G=(list(x) for _,x in ..` before I hang it on my wall? These days I am studying itertools and looking for such gems. – Muhammad Alkarouri Aug 07 '10 at 21:09
  • @bedwyr, the function `next()` was introduced in Python2.6. You can pass any iterator to it – John La Rooy Aug 07 '10 at 22:45
  • That's a great solution. Though I was thinking, wouldn't enumerating through the list have the same effect as using count()? Or would either approach have an effect on performance? – Jeff Mercado Aug 08 '10 at 00:22
  • @Jeff, I added a version using enumerate. Not sure which would be faster – John La Rooy Aug 08 '10 at 01:41
  • This is great. I think it may fail the test for pythonicity on the "flat is better than nested" clause, or maybe the explicit/implicit one, but it's still a +1. Probably for maximum pythonicity you would pull some of the enclosed elements out into assignments or `def` s. As is it seems more geared for golficity. – intuited Aug 08 '10 at 04:29
  • I found that sorting the list was a requirement for this solution to work. – paragbaxi Mar 30 '12 at 21:10
  • @ghee22, yes it expects a presorted list – John La Rooy Mar 30 '12 at 22:19
3

Whether this is pythonic is up for debate. But it is very compact. The real meat is in the Rangify() function. There's still room for improvement if you want efficiency or Pythonism.

def CreateRangeString(zones):
    #assuming sorted and distinct
    deltas = [a-b for a, b in zip(zones[1:], zones[:-1])]
    deltas.append(-1)
    def Rangify((b, p), (z, d)):
        if p is not None:
            if d == 1: return (b, p)
            b.append('%d-%d'%(p,z))
            return (b, None)
        else:
            if d == 1: return (b, z)
            b.append(str(z))
            return (b, None)
    return ','.join(reduce(Rangify, zip(zones, deltas), ([], None))[0])

To describe the parameters:

  • deltas is the distance to the next value (inspired from an answer here on SO)
  • Rangify() does the reduction on these parameters
    • b - base or accumulator
    • p - previous start range
    • z - zone number
    • d - delta
Community
  • 1
  • 1
Jeff Mercado
  • 129,526
  • 32
  • 251
  • 272
  • you can assume distinct, but not necessarily sorted. Since the list is numeric, however, the sort() method makes resolving the assumption fairly trivial. – bedwyr Aug 07 '10 at 16:16
  • Actually, I take that back - my code which pushes the list into the function performs a pre-sort. Never mind :) – bedwyr Aug 07 '10 at 18:06
1

To concatenate strings you should use ','.join. This removes the 2nd loop.

def createRangeString(zones):
        rangeIdx = 0
        ranges   = [[zones[0], zones[0]]]
        for zone in list(zones):
            if ranges[rangeIdx][1] in (zone, zone-1):
                ranges[rangeIdx][1] = zone
            else:
                ranges.append([zone, zone])
                rangeIdx += 1

       return ','.join(
                map(
                  lambda p: '%s-%s'%tuple(p) if p[0] != p[1] else str(p[0]),
                  ranges
                )
              )

Although I prefer a more generic approach:

from itertools import groupby

# auxiliary functor to allow groupby to compare by adjacent elements.
class cmp_to_groupby_key(object):
  def __init__(self, f):
    self.f = f
    self.uninitialized = True
  def __call__(self, newv):
    if self.uninitialized or not self.f(self.oldv, newv):
      self.curkey = newv
      self.uninitialized = False
    self.oldv = newv
    return self.curkey

# returns the first and last element of an iterable with O(1) memory.
def first_and_last(iterable):
  first = next(iterable)
  last = first
  for i in iterable:
    last = i
  return (first, last)

# convert groups into list of range strings
def create_range_string_from_groups(groups):
  for _, g in groups:
    first, last = first_and_last(g)
    if first != last:
      yield "{0}-{1}".format(first, last)
    else:
      yield str(first)

def create_range_string(zones):
  groups = groupby(zones, cmp_to_groupby_key(lambda a,b: b-a<=1))
  return ','.join(create_range_string_from_groups(groups))

assert create_range_string([0,1,2,3]) == '0-3'
assert create_range_string([0, 1, 2, 4, 8]) == '0-2,4,8'
assert create_range_string([1,2,3,4,6,7,8,9,12,13,19,20,22,22,22,23,40,44]) == '1-4,6-9,12-13,19-20,22-23,40,44'
kennytm
  • 510,854
  • 105
  • 1,084
  • 1,005
1

This is more verbose, mainly because I have used generic functions that I have and that are minor variations of itertools functions and recipes:

from itertools import tee, izip_longest
def pairwise_longest(iterable):
    "variation of pairwise in http://docs.python.org/library/itertools.html#recipes"
    a, b = tee(iterable)
    next(b, None)
    return izip_longest(a, b)

def takeuntil(predicate, iterable):
    """returns all elements before and including the one for which the predicate is true
    variation of http://docs.python.org/library/itertools.html#itertools.takewhile"""
    for x in iterable:
        yield x
        if predicate(x):
            break

def get_range(it):
    "gets a range from a pairwise iterator"
    rng = list(takeuntil(lambda (a,b): (b is None) or (b-a>1), it))
    if rng:
        b, e = rng[0][0], rng[-1][0]
        return "%d-%d" % (b,e) if b != e else "%d" % b

def create_ranges(zones):
    it = pairwise_longest(zones)
    return ",".join(iter(lambda:get_range(it),None))

k=[0,1,2,4,5,7,9,12,13,14,15]
print create_ranges(k) #0-2,4-5,7,9,12-15
Muhammad Alkarouri
  • 23,884
  • 19
  • 66
  • 101
1

how about this mess...

def rangefy(mylist):
    mylist, mystr, start = mylist + [None], "", 0
    for i, v in enumerate(mylist[:-1]):
            if mylist[i+1] != v + 1:
                    mystr += ["%d,"%v,"%d-%d,"%(start,v)][start!=v]
                    start = mylist[i+1]
    return mystr[:-1]
Lord British
  • 405
  • 3
  • 10
  • Does anyone have any idea how to rewrite using join or how to avoid summing None or how to make the above code shorter? – Lord British Sep 10 '10 at 15:51
0

Here is my solution. You need to keep track of various pieces of information while you iterate through the list and create the result - this screams generator to me. So here goes:

def rangeStr(start, end):
    '''convert two integers into a range start-end, or a single value if they are the same''' 
    return str(start) if start == end else "%s-%s" %(start, end)

def makeRange(seq):
    '''take a sequence of ints and return a sequence
    of strings with the ranges
    '''
    # make sure that seq is an iterator
    seq = iter(seq)
    start = seq.next()
    current = start
    for val in seq:
        current += 1
        if val != current:
            yield rangeStr(start, current-1)
            start = current = val
    # make sure the last range is included in the output
    yield rangeStr(start, current)

def stringifyRanges(seq):
    return ','.join(makeRange(seq))

>>> l = [1,2,3, 7,8,9, 11, 20,21,22,23]
>>> l2 = [1,2,3, 7,8,9, 11, 20,21,22,23, 30]
>>> stringifyRanges(l)
'1-3,7-9,11,20-23'
>>> stringifyRanges(l2)
'1-3,7-9,11,20-23,30'

My version will work correctly if given an empty list, which I think some of the others will not.

>>> stringifyRanges( [] )
''

makeRanges will work on any iterator that returns integers and lazily returns a sequence of strings so can be used on infinite sequences.

edit: I have updated the code to handle single numbers that are not part of a range.

edit2: refactored out rangeStr to remove duplication.

Dave Kirby
  • 25,806
  • 5
  • 67
  • 84
0
def createRangeString(zones):
    """Create a string with integer ranges in the format of '%d-%d'
    >>> createRangeString([0, 1, 2, 4, 8])
    "0-2,4,8"
    >>> createRangeString([1,2,3,4,6,7,8,9,12,13,19,20,22,22,22,23,40,44])
    "1-4,6-9,12-13,19-20,22-23,40,44"
    """
    buffer = []

    try:
        st = ed = zones[0]
        for i in zones[1:]:
            delta = i - ed
            if delta == 1: ed = i
            elif not (delta == 0):
                buffer.append((st, ed))
                st = ed = i
        else: buffer.append((st, ed))
    except IndexError:
        pass

    return ','.join(
            "%d" % st if st==ed else "%d-%d" % (st, ed)
            for st, ed in buffer)
satoru
  • 31,822
  • 31
  • 91
  • 141