import re

p = re.compile(r"([?.;])")

ss = re.split(p, 'This is a test? This is a test?good.bad')

for s in ss:
    print(s)

The result is:

This is a test
?
 This is a test
?
good
.
bad

I would like the result to be:

This is a test?
This is a test?
good.
bad

Why does it put the delimiter on another line?

EDIT: I think I understand why it did that. The question is how to produce the result I want.

yatu
marlon

2 Answers

You can join back the delimiters and preceding items:

ss = re.split(p, 'This is a test? This is a test?good.bad')
# Pair each text chunk with the delimiter that follows it, then keep the unpaired last chunk.
result = [a + b for a, b in zip(ss[::2], ss[1::2])] + (ss[-1:] if len(ss) % 2 else [])
Błotosmętek
  • ['Th', 'is', ' i', 's ', 'a ', 'te', 'st', '? ', 'Th', 'is', ' i', 's ', 'a ', 'te', 'st', '?g', 'oo', 'd.', 'ba'] result – marlon Apr 13 '20 at 16:44
  • Nice! I get ['This is a test?', ' This is a test?', 'good.'] It only misses the last 'bad' – Mace Apr 13 '20 at 16:48
  • Ah yes, I assumed even number of items in `ss`. Fixed. – Błotosmętek Apr 13 '20 at 16:51
  • Not quite right. It gives an empty '' if 'bad' is removed from the original string. You can test. – marlon Apr 13 '20 at 16:55
  • Well, if there's a dot after `good`, there's an empty string after the dot… This can be avoided using an additional check: `result = [ a+b for a, b in zip(ss[::2], ss[1::2]) ] + ((ss[-1:] if ss[-1] else []) if len(ss) % 2 else [])` – Błotosmętek Apr 13 '20 at 16:58
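
The two cases raised in the comments (a trailing chunk such as 'bad', and a string that ends in a delimiter) can be handled by one small helper. This is only a sketch: `split_keep_delimiters` is a hypothetical name, not part of `re`, and it relies on `re.split` with a single capturing group returning an odd-length list that alternates text and delimiters.

```python
import re

p = re.compile(r"([?.;])")


def split_keep_delimiters(text, pattern=p):
    """Hypothetical helper: split text, keeping each delimiter attached to the chunk before it."""
    # With one capturing group the list alternates text, delimiter, text, ..., text.
    ss = pattern.split(text)
    # Glue every text chunk to the delimiter that follows it.
    result = [a + b for a, b in zip(ss[::2], ss[1::2])]
    # The last chunk has no delimiter; keep it only when it is non-empty.
    if ss[-1]:
        result.append(ss[-1])
    return result


print(split_keep_delimiters('This is a test? This is a test?good.bad'))
print(split_keep_delimiters('This is a test? This is a test?good.'))
```

With 'bad' removed from the input, the empty string that `re.split` leaves after the final dot is dropped instead of being appended.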
A comment said you must use the pattern p. Here's a way to join the pairs back up after the split. zip_longest handles an odd number of items by returning None for the missing second element, which is then replaced with an empty string.

import re
from itertools import zip_longest

p = re.compile(r"([?.;])")

ss = re.split(p, 'This is a test? This is a test?good.bad')

for a, b in zip_longest(ss[::2], ss[1::2]):
    print(a + (b if b else ''))

Output:

This is a test?
 This is a test?
good.
bad
Mark Tolonen
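
As a small variation on the zip_longest loop above, `zip_longest` accepts a `fillvalue` argument, so the missing delimiter can be filled with an empty string directly instead of checking for None. A sketch, assuming the same pattern p and input; `lines` is only an illustrative name.

```python
import re
from itertools import zip_longest

p = re.compile(r"([?.;])")
ss = p.split('This is a test? This is a test?good.bad')

# fillvalue='' pads the shorter (delimiter) sequence, so a + b never sees None.
lines = [a + b for a, b in zip_longest(ss[::2], ss[1::2], fillvalue='')]

for line in lines:
    print(line)
```

Note that if the input ends with a delimiter, the last item in `lines` is an empty string, so it may need to be filtered out.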