How to split a list on an element delimiter

Question

Are there concise and elegant ways of splitting a list in Python into a list of sub-lists by a delimiting element, such that ['a', 'delim', 'b'] -> [['a'], ['b']]?

Here is the example:

ldat = ['a','b','c','a','b','c','a','b']
dlim = 'c'
lspl = []   # an elegant python one-liner wanted on this line!
print(lspl) # want: [['a', 'b'], ['a', 'b'], ['a', 'b']]

Working examples that seem overly complex

I have surveyed the documentation and related questions on Stack Overflow (many referenced below), and none of them answered my question. My research is summarized below: several approaches that do generate the desired output but are verbose and intricate. What is happening (splitting a list) is not immediately apparent; you really have to squint.

Are there better ways? I am primarily interested in readability for beginners (e.g. teaching), canonical / 'Pythonic' approaches, and secondarily in the most efficient approaches (e.g. timeit speed). Ideally answers would address both Python 2.7 and 3.x.

with conditional .append()

Loop through the list and either append to the last output list or add a new output list. Based on an example that includes the delimiter, but altered to exclude it. I'm not sure how to make it a one-liner, or if that is even desirable.

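lspl = [[]]                 # start with one empty sub-list
for i in ldat:
    if i == dlim:
        lspl.append([])     # delimiter: open a new sub-list
    else:
        lspl[-1].append(i)  # otherwise: extend the current sub-list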
print(lspl) # prints: [['a', 'b'], ['a', 'b'], ['a', 'b']]
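
One way to keep the call site to a single line is to move the loop above into a small helper. The sketch below is only an illustration: `split_list` is a hypothetical name, not a built-in or standard-library function. It assumes that empty sub-lists from leading, trailing, or adjacent delimiters should be kept.

```python
def split_list(items, delim):
    """Split items into sub-lists at every element equal to delim.

    Hypothetical helper; the delimiter itself is dropped.
    """
    result = [[]]
    for item in items:
        if item == delim:
            result.append([])        # delimiter: open a new sub-list
        else:
            result[-1].append(item)  # otherwise: extend the current one
    return result


ldat = ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b']
lspl = split_list(ldat, 'c')
print(lspl)  # prints: [['a', 'b'], ['a', 'b'], ['a', 'b']]
```

The loop is unchanged; only the name at the call site says "split", which may read more naturally to beginners than the bare append logic.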

with itertools.groupby

Combine itertools.groupby with a list comprehension. Many answers include the delimiters; this is based on those that exclude them.

import itertools
lspl = [list(y) for x, y in itertools.groupby(ldat, lambda z: z == dlim) if not x]
print(lspl) # prints: [['a', 'b'], ['a', 'b'], ['a', 'b']]
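
The groupby version and the conditional .append() version agree on the example data, but they appear to differ once delimiters sit at the start, at the end, or next to each other. The sketch below uses a made-up input (`ldat2` is a hypothetical name) to show the difference: groupby collapses runs of delimiters and yields no empty sub-lists, while the loop opens a new sub-list for every delimiter.

```python
import itertools

ldat2 = ['c', 'a', 'c', 'c', 'b', 'c']  # leading, adjacent, and trailing delimiters
dlim = 'c'

# groupby version: delimiter runs are filtered out, so no empty sub-lists appear
grouped = [list(y) for x, y in itertools.groupby(ldat2, lambda z: z == dlim) if not x]
print(grouped)   # prints: [['a'], ['b']]

# conditional .append() version: every delimiter opens a new sub-list
appended = [[]]
for i in ldat2:
    if i == dlim:
        appended.append([])
    else:
        appended[-1].append(i)
print(appended)  # prints: [[], ['a'], [], ['b'], []]
```

Which result is preferable depends on whether empty groups carry meaning in the data.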

with slicing on indices

Some related questions have discussed how to use slicing after using .index() -- however answers usually focus on finding the first index only. One can extend this approach by first finding a list of indices and then looping through a self-zipped list to slice the ranges.

indices = [i for i, x in enumerate(ldat) if x == dlim]
lspl = [ldat[s+1:e] for s, e in zip([-1] + indices, indices + [len(ldat)])]
print(lspl) # prints: [['a', 'b'], ['a', 'b'], ['a', 'b']]

However, like all the approaches I have found, this seems like a very complex way of enacting a simple split-on-delimiter operation.
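
If the two slicing lines are wrapped in a function, the split itself becomes a single call. This is a minimal sketch under that assumption; `split_at_indices` is a hypothetical name, and the input is assumed to support `len()` and slicing (a list or tuple, for example).

```python
def split_at_indices(items, delim):
    """Slice items between the positions of every element equal to delim.

    Hypothetical helper; works on any sequence that supports slicing.
    """
    indices = [i for i, x in enumerate(items) if x == delim]
    starts = [-1] + indices        # position just before each piece
    ends = indices + [len(items)]  # position just after each piece
    return [items[s + 1:e] for s, e in zip(starts, ends)]


ldat = ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b']
print(split_at_indices(ldat, 'c'))  # prints: [['a', 'b'], ['a', 'b'], ['a', 'b']]
```

Because the pieces are slices, a tuple input yields tuple pieces, and leading, trailing, or adjacent delimiters produce empty pieces.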

Comparison to string splitting

By comparison and as a model only, here is a working, concise, and elegant way of splitting a string into a list of sub-strings by a delimiter.

sdat = 'abcabcab'
dlim = 'c'
sspl = sdat.split(dlim)
print(sspl) # prints: ['ab', 'ab', 'ab']

NOTE: I understand there is no split method on lists in Python, and I am not asking about splitting a string. I am also not asking about splitting element-strings into new elements.

The slicing on indices method is what comes to mind, although that is two lines. Wrap it in a function :) then it's one line — Cory Kramer, Dec 05 '17 at 20:34
"I'm not sure how to make it a one-liner, or if that is even desirable." No, it isn't in and of itself. You can't get more canonical than a for-loop, really. In fact, the biggest problem I see with how you wrote your first example is by putting the `if` and `else` bodies on one line - use indentation. — juanpa.arrivillaga, Dec 05 '17 at 20:35
Really any behavior is fine. So `['delim', 'a', 'b', 'delim']` could become `[['a'], ['b']]`, or `[[], ['a'], ['b'], []]`, or even `[['a'], ['b'], []]` or `[[], ['a'], ['b']]`. — JeremyDouglass, Dec 05 '17 at 20:38
@juanpa.arrivillaga I have indented the if/else example for legibility -- good point. — JeremyDouglass, Dec 05 '17 at 20:40
@JeremyDouglass right, that solution is *perfectly* Pythonic. It is very readable, the logic is straight-forward and expressed in typical python idioms, e.g. "append to the last sublist" => `lspl[-1].append(i)`. It is also *performant*. — juanpa.arrivillaga, Dec 05 '17 at 20:43
@juanpa.arrivillaga Thank you, I appreciate your point on its virtues. I am still hoping for even clearer alternatives. I am working on code for use by first-time programmers, and there is nothing *intuitive* about a conditional negative append = splitting. — JeremyDouglass, Dec 05 '17 at 20:51

score -4 · Answer 1 · answered Dec 05 '17 at 20:37

or this:

ldat = ['a','b','c','a','b','c','a','b']
dlim = 'c'
lspl = []   # an elegant python one-liner wanted on this line!
print(lspl) # want: [['a', 'b'], ['a', 'b'], ['a', 'b']]

s = str(ldat).replace(", '%s', " % dlim, "],[")
result = eval(s)
print(result)

user508402

Do The Simplest Thing That Could Possibly Work. – user508402 Dec 05 '17 at 20:48
https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html – roganjosh Dec 05 '17 at 20:50
Wow. That does 'work' on that specific case, and it is simple, but it might be a bad combination of opaque and dangerous -- I really hope beginners won't cut-and-paste from this answer. – JeremyDouglass Dec 05 '17 at 21:01