How to re.split(" ", string) when there are multiple spaces?

Question

import re

d = ["11:00 PM", "13!00 PM", "11 00 PM"]

for i in d:
    print(re.split(" ", i))

For the first two strings this works and prints ["11:00", "PM"] and ["13!00", "PM"], but for "11 00 PM" it prints ["11", "00", "PM"]. How can I change the regex so that it returns ["11 00", "PM"] there too?

Use a normal split: `"11 00 PM".rsplit(' ', 1)`, which in your loop means `i.rsplit(' ', 1)` – furas, Oct 15 '19 at 18:06
You could try a more advanced regex, perhaps something like here (check out the demos!): https://stackoverflow.com/questions/12683201/python-re-split-to-split-by-spaces-commas-and-periods-but-not-in-cases-like – mgrollins, Oct 15 '19 at 18:08
Or simply: `print(re.split(r"\s+(?=AM|PM)", i))`, which is a little more specific: it only splits on whitespace that is followed by AM or PM – Mako212, Oct 15 '19 at 18:36
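
For anyone who wants to stay with a regex, here is a minimal sketch of the lookahead idea from the last comment. The `\b` after the AM/PM group and the name `pattern` are additions for illustration, not part of the original comment, and the sketch assumes the suffix is always an upper-case AM or PM.

```python
import re

d = ["11:00 PM", "13!00 PM", "11 00 PM"]

# Split on a run of whitespace only where AM or PM follows immediately.
# The lookahead (?=...) checks for the suffix without consuming it.
pattern = re.compile(r"\s+(?=(?:AM|PM)\b)")

results = [pattern.split(s) for s in d]
print(results)
# [['11:00', 'PM'], ['13!00', 'PM'], ['11 00', 'PM']]
```

The space inside "11 00" is left alone because it is followed by "00", not by AM or PM.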

furas · Answer 1 · 2019-10-15T18:30:32.033

For these examples you can use the plain string method text.rsplit() instead of a regex. Splitting once from the right separates the AM/PM suffix and leaves the rest of the string intact.

d = ["11:00 PM", "13!00 PM", "11 00 PM"]

for i in d:
    print(i.rsplit(" ", 1))

Result:

['11:00', 'PM']
['13!00', 'PM']
['11 00', 'PM']

EDIT:

If you want to remove whitespace (spaces, tabs, newlines) from both ends, you can use the normal text.strip(). Likewise, rstrip() and lstrip() strip only the right or left end. Or use strip(' ') if you want to remove only spaces and keep tabs and newlines.

i = i.strip().rsplit(" ", 1)

d = [" 11:00 PM", "\n 13!00 PM", "11 00 PM"]

for i in d:
    print(i.strip().rsplit(" ", 1))
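
To make the difference between the strip variants concrete, here is a small sketch. The sample string `s` is made up for illustration and is not from the question.

```python
s = " \t11 00 PM\n "

both = s.strip()          # removes all leading and trailing whitespace
left = s.lstrip()         # removes leading whitespace only
right = s.rstrip()        # removes trailing whitespace only
spaces_only = s.strip(" ")  # removes spaces only; the tab and newline stay

print(repr(both))         # '11 00 PM'
print(repr(left))         # '11 00 PM\n '
print(repr(right))        # ' \t11 00 PM'
print(repr(spaces_only))  # '\t11 00 PM\n'
```

Printing with repr() makes the remaining tab and newline characters visible.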

EDIT: If you want to keep the results, you can append them to a list:

d = [" 11:00 PM", "\n 13!00 PM", "11 00 PM"]

results = []

for i in d:
    results.append(i.strip().rsplit(" ", 1))

print(results)

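Or you can use a list comprehension, as @Alexander suggested in a comment. Note that rsplit(maxsplit=1) without a separator splits on any run of whitespace, not just a single space.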

d = [" 11:00 PM", "\n 13!00 PM", "11 00 PM"]

results = [x.strip().rsplit(maxsplit=1) for x in d]

print(results)
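
Since the title asks about multiple spaces, here is a short sketch of how the two rsplit calls differ when the time and the AM/PM suffix are separated by more than one space. The sample string `s` is hypothetical.

```python
s = "11 00   PM"  # three spaces before PM

# An explicit " " separator splits at exactly one space,
# so the extra spaces stay attached to the first part.
by_space = s.rsplit(" ", 1)

# No separator means "any run of whitespace" counts as one separator.
by_whitespace = s.rsplit(maxsplit=1)

print(by_space)       # ['11 00  ', 'PM']
print(by_whitespace)  # ['11 00', 'PM']
```

With an explicit separator, trailing whitespace also matters: `"11 00 PM ".rsplit(" ", 1)` gives `['11 00 PM', '']`, which is why the answer calls strip() first.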

edited Oct 15 '19 at 18:30

answered Oct 15 '19 at 18:08

furas

I forgot to mention that some of those strings may have leading whitespace or newlines. How do I get rid of those? `[" 11:00 PM", "\n 13!00 PM", "11 00 PM"]` – KarlJoosepK Oct 15 '19 at 18:10
`rsplit()` searches for the separator from right to left, so spaces at the beginning do not change where it splits. But if you want to remove whitespace at the right end use `i = i.rstrip()`, at the left end `i = i.lstrip()`, at both ends `i = i.strip()`, and then call `rsplit()`. Or do both at once: `i = i.strip().rsplit(" ", 1)` – furas Oct 15 '19 at 18:13
Or using a list comprehension: `[x.strip().rsplit(maxsplit=1) for x in d]` – Alexander Oct 15 '19 at 18:17
@Alexander good point. I added your list comprehension to the answer. – furas Oct 15 '19 at 18:31
