I wrote a function meant to improve on the built-in split() (I know it's not idiomatic Python, but I did my best). When I call it with these arguments:
better_split("After the flood ... all the colors came out."," .")
I expected this result:
['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']
However, the function behaves in a way I can't make sense of. When it reaches the last two words, it stops removing the ' ' separator and, rather than adding "came" and "out" to the result list, it adds "came out", so I get this:
['After', 'the', 'flood', 'all', 'the', 'colors', 'came out']
Does someone with more experience understand why this happens? Thank you in advance for any help!
def better_split(text,markersString):
    markers = []
    splited = []
    for e in markersString:
        markers.append(e)
    for character in text:
        if character in markers:
            point = text.find(character)
            if text[:point] not in character:
                word = text[:point]
                splited.append(word)
            while text[point] in markers and point+1 < len(text):
                point = point + 1
            text = text[point:]
    print 'final splited = ', splited

better_split("This is a test-of the,string separation-code!", " ,!-")
better_split("After the flood ... all the colors came out."," .")
split() WITH MULTIPLE SEPARATORS

If you are looking for split() with multiple separators, see: Split Strings with Multiple Delimiters?
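For comparison, the usual route with the re module can be sketched as below. This is only an illustration, not necessarily what that question recommends: the name regex_split is made up here, and it assumes seps is a non-empty string of single-character separators.

```python
import re

def regex_split(text, seps):
    # Hypothetical helper: build a character class such as [\ \.]+ from the separators.
    # re.escape protects characters like '.' or '-' that are special in a pattern.
    pattern = "[" + re.escape(seps) + "]+"
    # A run of separators counts as one split point; a leading or trailing
    # separator leaves an empty string, which is filtered out here.
    return [word for word in re.split(pattern, text) if word]

print(regex_split("After the flood ... all the colors came out.", " ."))
# ['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']
```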
The best answer I found that does not need import re was this:
def my_split(s, seps):
    res = [s]
    for sep in seps:
        s, res = res, []
        for seq in s:
            res += seq.split(sep)
    return res
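One thing to watch for: my_split keeps an empty string wherever two separators touch or a separator ends the text, because str.split(sep) does the same. A small sketch of how that could be filtered out, assuming empty pieces are never wanted (my_split_nonempty is a hypothetical name, not part of the answer above):

```python
def my_split(s, seps):
    # Same function as above, repeated so this snippet runs on its own.
    res = [s]
    for sep in seps:
        s, res = res, []
        for seq in s:
            res += seq.split(sep)
    return res

def my_split_nonempty(s, seps):
    # Hypothetical wrapper: drop the empty strings left by adjacent separators.
    return [piece for piece in my_split(s, seps) if piece]

print(my_split("a, b", ", "))           # ['a', '', 'b']
print(my_split_nonempty("a, b", ", "))  # ['a', 'b']
```

With the sentence from the question, my_split_nonempty("After the flood ... all the colors came out.", " .") gives the list of eight words the question expected.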