3

I have a string that I'd like to split in specific places into a list of strings. The split points are stored in a separate split list. For example:

test_string = "thequickbrownfoxjumpsoverthelazydog"
split_points = [0, 3, 8, 13, 16, 21, 25, 28, 32]

...should return:

>>> ['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']

So far I have this as the solution, but it looks incredibly convoluted for how simple the task is:

split_points.append(len(test_string))
print [test_string[start_token:end_token] for start_token, end_token in [(split_points[i], split_points[i+1]) for i in xrange(len(split_points)-1)]]

Any good string functions that do the job, or is this the easiest way?

Thanks in advance!

Raven

3 Answers

7

Like this?

>>> map(lambda x: test_string[slice(*x)], zip(split_points, split_points[1:]+[None]))
['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']

We're zipping split_points with a shifted copy of itself to create a list of all consecutive pairs of slice indexes, like [(0,3), (3,8), ...]. We need to add the last slice (32,None) manually, since zip terminates when the shortest sequence is exhausted.
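To make the pairing concrete, here is a small sketch (written for Python 3, where zip returns an iterator, so it is wrapped in list). The variable name pairs is only for illustration:

```python
test_string = "thequickbrownfoxjumpsoverthelazydog"
split_points = [0, 3, 8, 13, 16, 21, 25, 28, 32]

# Pair each split point with its successor; the appended None
# stands for "up to the end of the string" in the final pair.
pairs = list(zip(split_points, split_points[1:] + [None]))

print(pairs[:2])   # [(0, 3), (3, 8)]
print(pairs[-1])   # (32, None)
```

Without the extra [None], zip would stop after (28, 32) and the trailing 'dog' would be lost.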

Then we map a simple lambda slicer over that list. Note the slice(*x), which creates a slice object, e.g. slice(0, 3, None), that we can use to slice the sequence (string) with the standard item getter (__getitem__; the plain i:j syntax uses __getslice__ in Python 2).
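A quick sketch of what the slice object does on its own (Python 3; the names piece and tail are just for illustration):

```python
test_string = "thequickbrownfoxjumpsoverthelazydog"

# slice(*(0, 3)) unpacks the tuple, so it is the same as slice(0, 3).
piece = slice(*(0, 3))
print(piece)                # slice(0, 3, None)
print(test_string[piece])   # the

# A stop of None means "slice to the end", like test_string[32:].
tail = slice(32, None)
print(test_string[tail])    # dog
```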

A slightly more Pythonic implementation uses a list comprehension instead of map + lambda:

>>> [test_string[i:j] for i,j in zip(split_points, split_points[1:] + [None])]
['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
randomir
  • Why the lambda? You can unpack the tuple in a list comprehension and just do `[test_string[i:j] for i,j in zip(split_points, split_points[1:] + [None])]` – tobias_k Jul 14 '17 at 14:04
  • You're right, when thinking "functional", I always stray to `map` and `lambda`. `:-)` – randomir Jul 14 '17 at 14:09
2

This may be less convoluted:

>>> test_string = "thequickbrownfoxjumpsoverthelazydog"
>>> split_points = [0, 3, 8, 13, 16, 21, 25, 28, 32]
>>> split_points.append(len(test_string))
>>> print([test_string[i: j] for i, j in zip(split_points, split_points[1:])])
['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
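Note that append modifies split_points in place. If that is a concern, one possible variation is to build the boundaries on a copy inside a small helper. This is only a sketch; the function name split_at is made up here and is not a built-in:

```python
def split_at(text, points):
    """Split text at the given ascending indices; points is left untouched."""
    # Work on a copy so the caller's list does not grow by one element.
    bounds = list(points) + [len(text)]
    return [text[i:j] for i, j in zip(bounds, bounds[1:])]


test_string = "thequickbrownfoxjumpsoverthelazydog"
split_points = [0, 3, 8, 13, 16, 21, 25, 28, 32]

print(split_at(test_string, split_points))
# ['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
```

Calling it twice gives the same result, whereas repeating the append version would add len(test_string) a second time and produce a trailing empty string.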
DeepSpace
0

First draft:

for idx, i in enumerate(split_points):
    try:
        print(test_string[i:split_points[idx+1]])
    except IndexError:
        print(test_string[i:])
kchomski