Are there concise and elegant ways of splitting a list in Python into a list of sub-lists by a delimiting element, such that ['a', 'delim', 'b'] -> [['a'], ['b']]?
Here is the example:
ldat = ['a','b','c','a','b','c','a','b']
dlim = 'c'
lspl = [] # an elegant python one-liner wanted on this line!
print(lspl) # want: [['a', 'b'], ['a', 'b'], ['a', 'b']]
Working examples that seem overly complex
I have surveyed the documentation and related questions on Stack Overflow (many referenced below), which did not answer my question, and I summarize my research below. Several approaches do generate the desired output, but they are verbose and intricate, and what is happening (splitting a list) is not immediately apparent; you really have to squint.
Are there better ways? I am primarily interested in readability for beginners (e.g. teaching), canonical / 'Pythonic' approaches, and secondarily in the most efficient approaches (e.g. timeit speed). Ideally answers would address both Python 2.7 and 3.x.
with conditional .append()
Loop through the list and either append to the last output list or add a new output list. Based on an example that includes the delimiter, but altered to exclude it. I'm not sure how to make it a one-liner, or if that is even desirable.
lspl = [[]]
for i in ldat:
    if i == dlim:
        lspl.append([])
    else:
        lspl[-1].append(i)
print(lspl) # prints: [['a', 'b'], ['a', 'b'], ['a', 'b']]
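As a possible alternative to a one-liner, the same loop can be wrapped in a small generator so that the call site reads like a split. This is only a sketch: the name split_list is hypothetical (it is not a built-in or standard-library function), and it assumes elements are compared to the delimiter with ==.

```python
def split_list(items, delimiter):
    """Yield sub-lists of items, split on (and excluding) delimiter."""
    chunk = []
    for item in items:
        if item == delimiter:
            # Delimiter found: emit the finished chunk and start a new one.
            yield chunk
            chunk = []
        else:
            chunk.append(item)
    # Emit whatever follows the last delimiter (possibly an empty list).
    yield chunk


ldat = ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b']
dlim = 'c'
lspl = list(split_list(ldat, dlim))
print(lspl)  # prints: [['a', 'b'], ['a', 'b'], ['a', 'b']]
```

Like str.split with an explicit separator, this sketch keeps empty sub-lists for leading, trailing, or adjacent delimiters. It should run unchanged on Python 2.7 and 3.x.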
with itertools.groupby
Combine itertools.groupby with a list comprehension. Many answers include the delimiters; this is based on those that exclude them.
import itertools
lspl = [list(y) for x, y in itertools.groupby(ldat, lambda z: z == dlim) if not x]
print(lspl) # prints: [['a', 'b'], ['a', 'b'], ['a', 'b']]
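One difference worth noting is that the groupby variant and the append loop do not agree when delimiters are adjacent or sit at either end of the list. The sketch below uses a made-up input (not the example data above) to show this: groupby merges runs of delimiters and so never produces empty sub-lists, while the loop produces one sub-list per gap, as str.split does with an explicit separator.

```python
import itertools

# Hypothetical input with leading, trailing, and adjacent delimiters.
ldat = ['c', 'a', 'c', 'c', 'b', 'c']
dlim = 'c'

# groupby variant: consecutive delimiters form one group, which is dropped.
by_groupby = [list(group)
              for is_delim, group in itertools.groupby(ldat, lambda z: z == dlim)
              if not is_delim]

# append-loop variant: every delimiter starts a new, possibly empty, sub-list.
by_loop = [[]]
for item in ldat:
    if item == dlim:
        by_loop.append([])
    else:
        by_loop[-1].append(item)

print(by_groupby)  # prints: [['a'], ['b']]
print(by_loop)     # prints: [[], ['a'], [], ['b'], []]
```

Which behavior is preferable depends on whether empty sub-lists are meaningful for the data at hand.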
with slicing on indices
Some related questions discuss how to use slicing after using .index(); however, the answers usually focus on finding the first index only. One can extend this approach by first finding a list of all delimiter indices and then looping through a self-zipped list of them to slice out the ranges.
indices = [i for i, x in enumerate(ldat) if x == dlim]
lspl = [ldat[s+1:e] for s, e in zip([-1] + indices, indices + [len(ldat)])]
print(lspl) # prints: [['a', 'b'], ['a', 'b'], ['a', 'b']]
However, like all the approaches I have found, this seems like a very complex way of enacting a simple split-on-delimiter operation.
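Part of what makes the slicing version hard to read is the index arithmetic hidden inside the zip. Unpacking the intermediate values, as in the sketch below, may make it easier to follow; the names starts, ends, and bounds are introduced here only for illustration.

```python
ldat = ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b']
dlim = 'c'

# Positions of every delimiter in the list.
indices = [i for i, x in enumerate(ldat) if x == dlim]

# Each sub-list lies strictly between two boundaries: a virtual delimiter
# at -1, the real delimiters, and a virtual delimiter at len(ldat).
starts = [-1] + indices
ends = indices + [len(ldat)]
bounds = list(zip(starts, ends))

print(indices)  # prints: [2, 5]
print(bounds)   # prints: [(-1, 2), (2, 5), (5, 8)]

# Slice from just after each start boundary up to (not including) the end.
lspl = [ldat[s + 1:e] for s, e in bounds]
print(lspl)     # prints: [['a', 'b'], ['a', 'b'], ['a', 'b']]
```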
Comparison to string splitting
By comparison and as a model only, here is a working, concise, and elegant way of splitting a string into a list of sub-strings by a delimiter.
sdat = 'abcabcab'
dlim = 'c'
sspl = sdat.split(dlim)
print(sspl) # prints: ['ab', 'ab', 'ab']
NOTE: I understand there is no split method on lists in Python, and I am not asking about splitting a string. I am also not asking about splitting element-strings into new elements.