11

I'd like to split a string using one or more separator characters.

E.g. "a b.c", split on " " and "." would give the list ["a", "b", "c"].

At the moment, I can't see anything in the standard library to do this, and my own attempts are a bit clumsy. E.g.

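def my_split(string_L, split_chars):
    # Accept a single string on the first call; recursive calls pass a list.
    if isinstance(string_L, basestring):
        string_L = [string_L]
    try:
        split_char = split_chars[0]
    except IndexError:
        # No separators left: the accumulated pieces are the result.
        return string_L

    # Split every piece on the current separator, then recurse on the rest.
    res = []
    for s in string_L:
        res.extend(s.split(split_char))
    return my_split(res, split_chars[1:])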

print my_split("a b.c", [' ', '.'])

Horrible! Any better suggestions?

James Brady

4 Answers

38
>>> import re
>>> re.split('[ .]', 'a b.c')
['a', 'b', 'c']
Zack Zatkin-Gold
Ignacio Vazquez-Abrams
  • And remember that the characters have to be in square brackets []. I forgot about that and lost at least 20 minutes. Without the brackets, `split()` splits on the whole string. – noisy Mar 16 '13 at 00:35
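To illustrate the point about square brackets, here is a minimal sketch (Python 3 syntax) that builds the character class from an arbitrary set of separators. The helper name `split_on_chars` is made up for this example; `re.escape` is used on the assumption that some separators, such as `^`, `-` or `]`, would otherwise have a special meaning inside the brackets.

```python
import re

def split_on_chars(text, separators):
    """Split text on any single character in separators (hypothetical helper)."""
    if not separators:
        # An empty character class "[]" is not a valid pattern, so bail out early.
        return [text]
    # Escape each character so '.', '^', '-' and ']' are treated literally.
    pattern = "[" + re.escape("".join(separators)) + "]"
    return re.split(pattern, text)

print(split_on_chars("a b.c", " ."))   # ['a', 'b', 'c']

# Without the brackets, ' .' means "a space followed by any character".
print(re.split(" .", "a b.c"))         # ['a', '.c']

# Adjacent separators produce empty strings in the result.
print(re.split("[ .]", "a  b"))        # ['a', '', 'b']
```

If runs of separators should count as a single split point, a pattern such as `'[ .]+'` is one option.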
2

Solution without re:

from itertools import groupby
sep = ' .,'
s = 'a b.c,d'
print [''.join(g) for k, g in groupby(s, sep.__contains__) if not k]

An explanation is here: https://stackoverflow.com/a/19211729/2468006
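For readers who don't want to follow the link, this is a rough sketch of what the one-liner does, written in Python 3 syntax. `groupby` walks the string and groups consecutive characters by whether they are separators; keeping only the non-separator groups yields the pieces. The function name `split_groupby` is invented for this illustration.

```python
from itertools import groupby

def split_groupby(text, separators):
    """Split text on any character in separators, without re (hypothetical helper)."""
    is_sep = separators.__contains__  # True for separator characters
    return ["".join(group) for key, group in groupby(text, is_sep) if not key]

# The intermediate groups, with the key showing "is this a run of separators?"
print([(key, "".join(group)) for key, group in groupby("a b.c", " .".__contains__)])
# [(False, 'a'), (True, ' '), (False, 'b'), (True, '.'), (False, 'c')]

print(split_groupby("a b.c,d", " .,"))  # ['a', 'b', 'c', 'd']

# A run of separators is one group, so no empty strings appear.
print(split_groupby("a  b", " "))       # ['a', 'b']
```

Note that this differs from `re.split('[ .]', ...)`, which returns an empty string between two adjacent separators.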

Community
monitorius
2

This one replaces all of the separators with the first separator in the list, and then "splits" using that character.

def split(string, divs):
    for d in divs[1:]:
        string = string.replace(d, divs[0])
    return string.split(divs[0])

output:

>>> split("a b.c", " .")
['a', 'b', 'c']

>>> split("a b.c", ".")
['a b', 'c']

I do like that 're' solution though.
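Two properties of this replace-then-split approach may be worth seeing in action; the sketch below repeats the function so it runs on its own (Python 3 syntax), and the sample inputs are made up. Because it relies on `str.replace`, it also accepts a list of multi-character separators, and adjacent separators leave empty strings in the result.

```python
def split(string, divs):
    # Rewrite every other separator as the first one, then split once.
    for d in divs[1:]:
        string = string.replace(d, divs[0])
    return string.split(divs[0])

# Separators may be longer than one character if divs is a list.
print(split("a, b; c", [", ", "; "]))  # ['a', 'b', 'c']

# Two separators in a row give an empty string between them.
print(split("a b..c", " ."))           # ['a', 'b', '', 'c']
```

An empty `divs` raises `IndexError` at `divs[0]`, so callers would need to guard against that case.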

nakedfanatic
1

Not very fast but does the job:

def my_split(text, seps):
  for sep in seps:
    text = text.replace(sep, seps[0])
  return text.split(seps[0])
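Whether this is actually slow depends on the input, so here is a small sketch for measuring it with `timeit` (Python 3 syntax). The `re_split` helper, the sample string and the repeat count are arbitrary choices for this illustration, and no timings are quoted because they vary by machine and Python version.

```python
import re
import timeit

def my_split(text, seps):
    for sep in seps:
        text = text.replace(sep, seps[0])
    return text.split(seps[0])

def re_split(text, seps):
    # Hypothetical counterpart using a regex character class.
    return re.split("[" + re.escape(seps) + "]", text)

sample = "a b.c " * 1000  # made-up test input

# Both approaches should agree before comparing their speed.
assert my_split(sample, " .") == re_split(sample, " .")

for func in (my_split, re_split):
    seconds = timeit.timeit(lambda: func(sample, " ."), number=200)
    print(func.__name__, round(seconds, 4))
```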
yairchu