I need to split a string. I am using this:
import re

def ParseStringFile(string):
    p = re.compile(r'\W+')
    result = p.split(string)
    return result
But I have an error: my result has two empty strings (''), one before 'Лев'. How do I get rid of them?
As nhahtdh pointed out, the empty strings are expected since there's a \n at the start and end of the string, but if they bother you, you can filter them out quickly and efficiently.
>>> list(filter(None, ['', 'text', 'more text', '']))
['text', 'more text']
filter usually takes a callable as its first argument and keeps only the elements for which function(element) returns a true value. Here None is given, which triggers a special case: an element is removed if bool(element) is false. As bool('') is false, the empty strings get removed. Note that on Python 3 filter returns an iterator rather than a list, so wrap the call in list() if you need a list.
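As a small sketch of the special case described above, passing None behaves the same as passing bool or an explicit truthiness test. The sample list is the one from the example; the variable names are only illustrative.

```python
words = ['', 'text', 'more text', '']

# filter(None, ...) keeps only truthy elements. list() is needed on
# Python 3, where filter returns a lazy iterator instead of a list.
by_none = list(filter(None, words))

# Equivalent explicit forms: bool as the callable, or a lambda.
by_bool = list(filter(bool, words))
by_lambda = list(filter(lambda w: w != '', words))

print(by_none)  # ['text', 'more text']
```

The lambda form only matches the other two here because every element is a string; for mixed lists, filter(None, ...) would also drop 0, None, and empty containers.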
Also see the manual.
You could strip the leading and trailing newlines from the string before splitting it:
p.split(string.strip('\n'))
Alternatively, split the string and then remove the first and last element:
result = p.split(string)[1:-1]
The [1:-1] slice takes a copy of the result containing every index from 1 (i.e. dropping the first element) up to and including -2 (i.e. the second-to-last element); the second index of a slice is exclusive.
A longer and less elegant alternative would be to modify the list in-place:
result = p.split(string)
del result[-1] # remove last element
del result[0] # remove first element
Note that these two solutions assume the first and last elements are empty strings. If the input sometimes lacks the separators at the beginning or end, they will silently drop real words instead. However, they are also the fastest solutions.
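To make that caveat concrete, here is a short sketch comparing the slice on two hypothetical inputs (the strings framed and bare are made up for illustration), one with a newline at both ends and one without:

```python
import re

p = re.compile(r'\W+')

framed = '\nfoo bar baz\n'   # hypothetical input with a newline at both ends
bare = 'foo bar baz'         # hypothetical input without surrounding newlines

print(p.split(framed))        # ['', 'foo', 'bar', 'baz', '']
print(p.split(framed)[1:-1])  # ['foo', 'bar', 'baz']
print(p.split(bare)[1:-1])    # ['bar'], so 'foo' and 'baz' are lost
```

The slice is only safe when you know the input always starts and ends with a separator.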
If you want to remove all empty strings from the result, even ones that occur inside the list, you can use a list comprehension:
[word for word in p.split(string) if word]
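Put together, a minimal version of the whole function might look like the sketch below. It assumes Python 3, where \W is Unicode-aware for str patterns by default; the name split_words and the sample text are hypothetical, not taken from the question.

```python
import re

# Compile once at module level so repeated calls reuse the pattern.
NON_WORD = re.compile(r'\W+')

def split_words(text):
    """Split text on runs of non-word characters, dropping empty strings."""
    return [word for word in NON_WORD.split(text) if word]

sample = '\nЛев Толстой\n'   # hypothetical input with newlines at both ends
print(split_words(sample))   # ['Лев', 'Толстой']
```

This version gives the same result whether or not the input starts or ends with a separator, at the cost of one extra pass over the list.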