16

I have a question about how to use the split function.

my_str = 'James;Joseph;Arun;'
my_str.split(';')

I got the result ['James', 'Joseph', 'Arun', '']

I need the output as ['James', 'Joseph', 'Arun']

What is the best way to do it?

Mel
Jisson

2 Answers

27

To remove all empty strings you can use a list comprehension:

>>> [x for x in my_str.split(';') if x]

Or the filter/bool trick (on Python 3, wrap the call in list(), since filter returns an iterator there):

>>> list(filter(bool, my_str.split(';')))

Note that this will also remove empty strings at the start or in the middle of the list, not just at the end.

If you only want to remove the empty string at the end, you can use rstrip before splitting. Note that rstrip(';') strips every trailing semicolon, not just one.

>>> my_str.rstrip(';').split(';')
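
To make the difference concrete, here is a minimal sketch for Python 3. The sample value of `my_str` is made up for illustration; it has empty fields at the start, in the middle, and at the end.

```python
# Hypothetical sample input with leading, middle, and trailing empty fields.
my_str = ';James;;Arun;'

# Drops every empty string, wherever it occurs.
no_empties = [x for x in my_str.split(';') if x]

# Same result; list() is needed on Python 3 because filter() is lazy there.
no_empties_filter = list(filter(bool, my_str.split(';')))

# Strips only the trailing ';' characters before splitting,
# so the leading and middle empty strings remain.
trailing_only = my_str.rstrip(';').split(';')

print(no_empties)         # ['James', 'Arun']
print(no_empties_filter)  # ['James', 'Arun']
print(trailing_only)      # ['', 'James', '', 'Arun']
```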
Community
Mark Byers
  • 3
    +1 Hadn't heard about `filter(bool,x)` until now, only `filter(None,x)`. Which is better in your opinion? – jamylak May 28 '12 at 06:49
  • 1
    @jamylak: Both are fine. I prefer `filter(bool, x)` because it makes it more obvious why it works. Using `None` as a filtering function seems like magic (unless you have read the documentation to find out why it works). But others prefer `filter(None, x)` so I guess it doesn't make much difference. – Mark Byers May 28 '12 at 06:52
18

First remove ";" from the right end of the string:

s.rstrip(';').split(';')

You can also use filter(), which also removes empty elements that are not at the end of the string. In my opinion, though, the approach above is the cleanest way to avoid the empty element at the end that results from ";" characters occurring at the end of the string.

EDIT: The following approach is more precise still (the one above is, in turn, more precise than filter()):

(s[:-1] if s.endswith(';') else s).split(';')

This removes only the last element, and only when that element would be empty.
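
As a sketch, the same idea can be wrapped in a small helper that works for any separator. The name `split_drop_trailing` and its `sep` parameter are hypothetical, introduced here only for illustration.

```python
def split_drop_trailing(s, sep=';'):
    """Split s on sep, ignoring one trailing separator if present.

    Hypothetical helper: only the final empty field is dropped;
    empty fields elsewhere in the string are preserved.
    """
    if s.endswith(sep):
        # Cut off exactly one trailing separator before splitting.
        s = s[:-len(sep)]
    return s.split(sep)


print(split_drop_trailing('James;Joseph;Arun;'))  # ['James', 'Joseph', 'Arun']
print(split_drop_trailing('James;Joseph;;;'))     # ['James', 'Joseph', '', '']
```

On Python 3.9 or later, `s.removesuffix(sep).split(sep)` should give the same result without the explicit check.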

Testing the three solutions against the original shows that they give different results:

>>> def test_solution(solution):
    cases = [
        'James;Joseph;Arun;',
        'James;;Arun',
        'James;Joseph;Arun',
        ';James;Joseph;Arun',
        'James;Joseph;;;',
        ';;;',
        ]
    for case in cases:
        print('%r => %r' % (case, solution(case)))

>>> test_solution(lambda s: s.split(';'))  # original solution
'James;Joseph;Arun;' => ['James', 'Joseph', 'Arun', '']
'James;;Arun' => ['James', '', 'Arun']
'James;Joseph;Arun' => ['James', 'Joseph', 'Arun']
';James;Joseph;Arun' => ['', 'James', 'Joseph', 'Arun']
'James;Joseph;;;' => ['James', 'Joseph', '', '', '']
';;;' => ['', '', '', '']
>>> test_solution(lambda s: list(filter(bool, s.split(';'))))
'James;Joseph;Arun;' => ['James', 'Joseph', 'Arun']
'James;;Arun' => ['James', 'Arun']
'James;Joseph;Arun' => ['James', 'Joseph', 'Arun']
';James;Joseph;Arun' => ['James', 'Joseph', 'Arun']
'James;Joseph;;;' => ['James', 'Joseph']
';;;' => []
>>> test_solution(lambda s: s.rstrip(';').split(';'))
'James;Joseph;Arun;' => ['James', 'Joseph', 'Arun']
'James;;Arun' => ['James', '', 'Arun']
'James;Joseph;Arun' => ['James', 'Joseph', 'Arun']
';James;Joseph;Arun' => ['', 'James', 'Joseph', 'Arun']
'James;Joseph;;;' => ['James', 'Joseph']
';;;' => ['']
>>> test_solution(lambda s: (s[:-1] if s.endswith(';') else s).split(';'))
'James;Joseph;Arun;' => ['James', 'Joseph', 'Arun']
'James;;Arun' => ['James', '', 'Arun']
'James;Joseph;Arun' => ['James', 'Joseph', 'Arun']
';James;Joseph;Arun' => ['', 'James', 'Joseph', 'Arun']
'James;Joseph;;;' => ['James', 'Joseph', '', '']
';;;' => ['', '', '']
Tadeck
  • 2
    IMO change it to `rstrip` since he said the **last** empty space. – jamylak May 28 '12 at 06:51
  • 1
    @jamylak: correct, I was adding that information while you were writing your comment. Please see the updated answer. – Tadeck May 28 '12 at 06:56