1

Given a string:

x = 'foo test1 test1 foo test2 foo'  

I want to partition the string by foo, so that I get something along the lines of:

['foo', 'test1 test1 foo', 'test2 foo'] (preferred)

                 or

[['foo'], ['test1', 'test1', 'foo'], ['test2', 'foo']]  (not preferred, but workable)

I've tried itertools.groupby:

In [1209]: [list(v) for _, v in itertools.groupby(x.split(), lambda k: k != 'foo')]
Out[1209]: [['foo'], ['test1', 'test1'], ['foo'], ['test2'], ['foo']]

But it doesn't exactly give me what I'm looking for. I know I could use a loop and do this:

In [1210]: l = [[]]
      ...: for v in x.split():
      ...:     l[-1].append(v)
      ...:     if v == 'foo':
      ...:         l.append([])
      ...:     

In [1211]: l
Out[1211]: [['foo'], ['test1', 'test1', 'foo'], ['test2', 'foo'], []]

But it isn't very efficient, and it leaves an empty list at the end. Is there a simpler way?

I want to retain the delimiter.

cs95

5 Answers

3

Maybe not the prettiest approach, but concise and straightforward:

[part + 'foo' for part in x.split('foo')][:-1]

Output:

['foo', ' test1 test1 foo', ' test2 foo']
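
The pieces above keep the leading space from the original string. As a minimal sketch, assuming plain substring matching on 'foo' is acceptable (it would also match inside a word such as 'food'), stripping each piece gives the preferred output from the question:

```python
x = 'foo test1 test1 foo test2 foo'

# Split on the delimiter, drop the piece after the last 'foo',
# re-attach the delimiter, and strip the leftover spacing.
parts = [(part + 'foo').strip() for part in x.split('foo')[:-1]]
print(parts)  # ['foo', 'test1 test1 foo', 'test2 foo']
```

Note that the slice discards whatever follows the last 'foo', so this only suits input that ends with the delimiter.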
perigon
3

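You can use str.partition for your case:

def find_foo(x):
    result = []
    while x:
        # partition returns (before, separator, rest); the separator is ''
        # when "foo" is not found, so trailing text is kept unchanged.
        before, sep, x = x.partition("foo")
        result.append(before + sep)
    return result

>>> find_foo('foo test1 test1 foo test2 foo')
['foo', ' test1 test1 foo', ' test2 foo']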
Cédric Julien
1

Had you thought about iterating over the string and using a start position for your searches? This can often turn out to be faster than chopping the strings up as you go. This might work for you:

x = 'foo test1 test1 foo test2 foo'  

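def findall(target, s):
    lt = len(target)
    pos = 0
    result = []
    while pos < len(s):
        found = s.find(target, pos)
        if found == -1:
            # No more delimiters: keep the trailing text as the last piece.
            result.append(s[pos:])
            break
        fpos = found + lt
        result.append(s[pos:fpos])
        pos = fpos
    return result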

print(findall("foo", x))
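
If the pieces are consumed one at a time, the same start-position search can be written lazily. This is only a sketch: the name iter_chunks is hypothetical, and it assumes a non-empty target, since an empty one would never advance the position.

```python
def iter_chunks(target, s):
    """Yield pieces of s, each ending with target (except possibly the last)."""
    if not target:
        raise ValueError("target must be non-empty")
    pos = 0
    while pos < len(s):
        found = s.find(target, pos)
        if found == -1:
            # No delimiter left: yield the remaining text and stop.
            yield s[pos:]
            return
        end = found + len(target)
        yield s[pos:end]
        pos = end

print(list(iter_chunks('foo', 'foo test1 test1 foo test2 foo')))
# ['foo', ' test1 test1 foo', ' test2 foo']
```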
holdenweb
1

With re imported, you could use a positive lookbehind, (?<=...), in a regular expression:

In [515]: string = 'foo test1 test1 foo test2 foo'

In [516]: re.split(r'(?<=foo)\s', string)
Out[516]: ['foo', 'test1 test1 foo', 'test2 foo']

And,

In [517]: [x.split() for x in re.split(r'(?<=foo)\s', string)]
Out[517]: [['foo'], ['test1', 'test1', 'foo'], ['test2', 'foo']]
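
To reuse the lookbehind with a different delimiter, the pattern can be built from a variable. A rough sketch, where split_keep is a hypothetical helper name: re.escape guards against regex metacharacters in the delimiter, and the lookbehind stays fixed-width because the delimiter is a literal string.

```python
import re

def split_keep(text, word):
    # Split on whitespace that directly follows `word`, so each piece
    # keeps the delimiter at its end. This matches `word` as a plain
    # substring, not as a whole word.
    pattern = r'(?<=' + re.escape(word) + r')\s+'
    return re.split(pattern, text)

print(split_keep('foo test1 test1 foo test2 foo', 'foo'))
# ['foo', 'test1 test1 foo', 'test2 foo']
```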
Zero
0

Try this:

x = 'foo test1 test1 foo test2 foo'

word = 'foo'
out = []
while word in x:
    # Cut just past the next occurrence of the delimiter.
    end = x.index(word) + len(word)
    out.append(x[:end])
    x = x[end:]

print(out)

Output:

['foo', ' test1 test1 foo', ' test2 foo']
Kallz