1

I have a list that contains a random number of ints. I would like to iterate over this list, and if a number and the successive number are within one numeric step of one another, I would like to concatenate them into a sublist.

For example:

input = [1,2,4,6,7,8,10,11]
output = [[1,2],[4],[6,7,8],[10,11]]

The input list will always contain positive ints sorted in increasing order. I tried some of the code from here.

initerator = iter(inputList)
outputList = [c + next(initerator, "") for c in initerator]

Although I can concat every two entries in the list, I cannot seem to add a suitable if in the list comprehension.

Python version = 3.4

Eoin
  • One-liner: `[[x for i, x in grp] for _, grp in itertools.groupby(enumerate(inputList), lambda x: x[1] - x[0])]`. – Mark Dickinson Aug 03 '16 at 10:09
  • Thank you, this looks good, and I kind of understand how it works. However, what if I prepend each entry with a letter, so I have input = [a1,a2,a4,b6,c7,c8,c10,d11]? I would also like to group by character, resulting in [[a1,a2],[a4],[b6],[c7,c8],[c10],[d11]]. I don't fully understand the "groupby" and "enumerate" methods, which is probably why I can't get this. Thanks again. – Eoin Aug 03 '16 at 10:31
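
Regarding how the groupby/enumerate one-liner works: as far as I can tell, it relies on the fact that inside a run of consecutive integers, value minus position stays constant. Here is a minimal sketch that spells this out. The names `group_consecutive` and `key` are mine, not from the comment, and it assumes the input is strictly increasing, as stated in the question.

```python
from itertools import groupby

def group_consecutive(nums):
    """Group a strictly increasing list of ints into runs of consecutive values."""
    # enumerate() pairs each value with its index: (0, 1), (1, 2), (2, 4), ...
    # Inside a consecutive run, value - index is constant, so it works as a
    # groupby() key; the key changes exactly where a gap occurs.
    def key(pair):
        index, value = pair
        return value - index

    return [[value for _, value in grp]
            for _, grp in groupby(enumerate(nums), key)]

print(group_consecutive([1, 2, 4, 6, 7, 8, 10, 11]))
# [[1, 2], [4], [6, 7, 8], [10, 11]]
```

For the sample input the keys are 1, 1, 2, 3, 3, 3, 4, 4, which is why the groups come out as shown.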

3 Answers

1

A nice way: find the "splitting" indices and then slice:

input = [1,2,4,6,7,8,10,11]
idx = [0] + [i+1 for i,(x,y) in enumerate(zip(input,input[1:])) if x+1!=y] + [len(input)]
[ input[u:v] for u,v in zip(idx, idx[1:]) ]
#output:
[[1, 2], [4], [6, 7, 8], [10, 11]]

This uses enumerate() and zip().
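
To make the index trick easier to follow, here is the same approach broken into steps with the intermediate values printed. This is only a sketch; the names `nums`, `breaks`, and `bounds` are mine (I also avoid `input`, since that shadows the built-in).

```python
nums = [1, 2, 4, 6, 7, 8, 10, 11]

# Positions where a new run starts: i + 1 whenever nums[i] + 1 != nums[i + 1].
breaks = [i + 1 for i, (x, y) in enumerate(zip(nums, nums[1:])) if x + 1 != y]
print(breaks)  # [2, 3, 6]

# Add both ends so every run has a start and a stop index.
idx = [0] + breaks + [len(nums)]
print(idx)  # [0, 2, 3, 6, 8]

# Neighbouring index pairs are the slice bounds of each run.
bounds = list(zip(idx, idx[1:]))
print(bounds)  # [(0, 2), (2, 3), (3, 6), (6, 8)]

runs = [nums[u:v] for u, v in bounds]
print(runs)  # [[1, 2], [4], [6, 7, 8], [10, 11]]
```

One thing to watch: for an empty list this gives `[[]]` rather than `[]`, because `idx` becomes `[0, 0]`.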

Graham
Ohad Eytan
1

Unless you need a one-liner, you could use a simple generator function that collects elements until it hits a non-consecutive one:

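def consec(lst):
    it = iter(lst)
    try:
        prev = next(it)
    except StopIteration:
        # Empty input: yield nothing (a bare next() here raises RuntimeError on Python 3.7+).
        return
    tmp = [prev]
    for ele in it:
        if prev + 1 != ele:
            # Gap found: emit the finished run and start a new one.
            yield tmp
            tmp = [ele]
        else:
            tmp.append(ele)
        prev = ele
    yield tmp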

Output:

In [2]: lst = [1, 2, 4, 6, 7, 8, 10, 11]

In [3]: list(consec(lst))
Out[3]: [[1, 2], [4], [6, 7, 8], [10, 11]]
Padraic Cunningham
0

Simplest version I have without any imports:

def mergeAdjNum(l):
    r = [[l[0]]]
    for e in l[1:]:
        if r[-1][-1] == e - 1:
            r[-1].append(e)
        else:
            r.append([e])
    return r

About 33% faster than the one-liners.
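
If you want to check the speed difference on your own machine, something like the following `timeit` harness should do. It is only a sketch: `merge_adj_num` mirrors the loop above, `groupby_version` is the one-liner from the comments, and the test data and repeat count are arbitrary choices of mine. I am not claiming any particular numbers, since results will vary with input and Python version.

```python
import timeit
from itertools import groupby

def merge_adj_num(nums):
    # Loop version, same logic as mergeAdjNum above (expects a non-empty list).
    result = [[nums[0]]]
    for e in nums[1:]:
        if result[-1][-1] == e - 1:
            result[-1].append(e)
        else:
            result.append([e])
    return result

def groupby_version(nums):
    # The groupby/enumerate one-liner from the comments.
    return [[x for _, x in grp]
            for _, grp in groupby(enumerate(nums), lambda p: p[1] - p[0])]

# Arbitrary sample data: sorted positive ints with a gap at every multiple of 7.
data = [n for n in range(1, 2000) if n % 7]

for func in (merge_adj_num, groupby_version):
    seconds = timeit.timeit(lambda: func(data), number=200)
    print(func.__name__, round(seconds, 4))
```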

This one handles the character prefix grouping mentioned in a comment:

import re

def groupPrefStr(l):
    pattern = re.compile(r'([a-z]+)([0-9]+)')
    r = [[l[0]]]
    pp, vv = pattern.match(l[0]).groups()
    vv = int(vv)
    for e in l[1:]:
        p, v = pattern.match(e).groups()
        v = int(v)
        if p == pp and v == vv + 1:
            r[-1].append(e)
        else:
            r.append([e])
        # Always track the latest element; otherwise runs longer than two get split.
        pp, vv = p, v
    return r

This is much slower than the number-only version. Knowing the exact format of the prefix (only one character?) could help avoid the re module and speed things up.
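
If the prefix really is a single character (an assumption on my part; the question doesn't confirm it), a regex-free version could look roughly like this. The name `group_single_char_prefix` is just for illustration, and I haven't timed it against the re version.

```python
def group_single_char_prefix(items):
    """Group strings like 'a1', 'a2' by prefix and consecutive number.

    Assumes every item is exactly one prefix character followed by digits.
    """
    result = []
    prev_prefix, prev_num = None, None
    for item in items:
        # Split by position instead of using a regex.
        prefix, num = item[0], int(item[1:])
        if result and prefix == prev_prefix and num == prev_num + 1:
            result[-1].append(item)
        else:
            result.append([item])
        prev_prefix, prev_num = prefix, num
    return result

print(group_single_char_prefix(['a1', 'a2', 'a4', 'b6', 'c7', 'c8', 'c10', 'd11']))
# [['a1', 'a2'], ['a4'], ['b6'], ['c7', 'c8'], ['c10'], ['d11']]
```

Unlike the versions above, this one also returns `[]` for an empty list instead of raising.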

guillaume.deslandes