28

I have a concatenated string like this:

my_str = 'str1;str2;str3;'

and I would like to split it, convert the resulting list to a tuple, and drop any empty strings produced by the split (note the trailing ';').

So far, I am doing this:

tuple(filter(None, my_str.split(';')))

Is there any more efficient (in terms of speed and space) way to do it?

MLister
  • @Levon, point taken, sorry, i just picked an example variable name in a hurry. thanks. – MLister Jun 12 '12 at 16:59
  • 1
    Please explain what exactly you mean by "better". – NPE Jun 12 '12 at 16:59
  • 1. Can empty segments only occur due to an additional `;` at the end, or might there be empty strings in the middle of the list? 2. Why do you want to convert the result to a tuple? Usually, simply using the list returned by `str.split()` should do fine. – Sven Marnach Jun 12 '12 at 17:00
  • @robert, what is the faster way? – MLister Jun 12 '12 at 17:00
  • @aix, sure, just amend the question, please see the update above. – MLister Jun 12 '12 at 17:01
  • @SvenMarnach, because I am using the tuple as a key in a dictionary later. – MLister Jun 12 '12 at 17:02
  • 4
    Depending on what datasets you're applying this to, it's entirely likely that you'll spend more time on this question -- writing it, reading the replies, and so on -- than you will save in runtime on the difference between the fastest and the slowest methods. This is true of at least half of the "fastest way" Python questions that get asked 'round here. – DSM Jun 12 '12 at 17:04
  • @Levon, the actual use case is I have around 200 of them to process in any given request to a server, if that makes any difference. – MLister Jun 12 '12 at 17:05
  • @MLister Thanks for the additional information. I suppose you could benchmark the various approaches and pick one you like best in terms of speed (if there's a noticeable difference) and readability. – Levon Jun 12 '12 at 17:13

8 Answers

20

How about this?

tuple(my_str.split(';')[:-1])
('str1', 'str2', 'str3')

You split the string at the ; character and pass all of the substrings (except the last one, the empty string) to tuple to create the result tuple.
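
One caveat worth checking: the slice drops the last element unconditionally, so it only matches the filter version when the string really ends with the separator. A minimal sketch of the difference (the helper name `split_drop_last` is made up for illustration):

```python
def split_drop_last(s, sep=';'):
    # Hypothetical helper: split, then discard the final element unconditionally.
    return tuple(s.split(sep)[:-1])

with_trailing = split_drop_last('str1;str2;str3;')
without_trailing = split_drop_last('str1;str2;str3')

print(with_trailing)     # ('str1', 'str2', 'str3')
print(without_trailing)  # ('str1', 'str2')
```

If the input is not guaranteed to end with ';', the second call silently loses 'str3'.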

xxx
Levon
13

That is a very reasonable way to do it. Some alternatives:

  • foo.strip(";").split(";") (if there won't be any empty slices inside the string)
  • [ x.strip() for x in foo.split(";") if x.strip() ] (to strip whitespace from each slice)

The "fastest" way to do this will depend on a lot of things… but you can easily experiment with ipython's %timeit:

In [1]: foo = "1;2;3;4;"

In [2]: %timeit foo.strip(";").split(";")
1000000 loops, best of 3: 1.03 us per loop

In [3]: %timeit filter(None, foo.split(';'))
1000000 loops, best of 3: 1.55 us per loop
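
Outside IPython, roughly the same experiment can be run with the standard-library `timeit` module. This is only a sketch: the `candidates` dictionary and its labels are made up here, each lambda adds a small call overhead, and the absolute numbers will differ by machine and Python version.

```python
import timeit

foo = "1;2;3;4;"

# Each candidate builds the final tuple, so the comparison covers the whole task.
candidates = {
    "strip + split": lambda: tuple(foo.strip(";").split(";")),
    "filter": lambda: tuple(filter(None, foo.split(";"))),
    "slice": lambda: tuple(foo.split(";")[:-1]),
}

# All candidates should agree on this input before their timings are compared.
results = {name: fn() for name, fn in candidates.items()}

calls = 10_000
for name, fn in candidates.items():
    # Take the best of 3 repeats to reduce noise from other processes.
    best = min(timeit.repeat(fn, number=calls, repeat=3))
    print(f"{name}: {best / calls * 1e6:.3f} us per call")
```

Wrapping each call in `tuple(...)` also keeps the comparison fair on Python 3, where `filter` returns a lazy iterator rather than a list.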
David Wolever
4

If you only expect an empty string at the end, you can do:

a = 'str1;str2;str3;'
tuple(a.split(';')[:-1])

or

tuple(a[:-1].split(';'))
exfizik
3

Try tuple(my_str.split(';')[:-1])

googler
2

Yes, that is quite a Pythonic way to do it. If you have a love for generator expressions, you could also replace the filter() with:

tuple(part for part in my_str.split(';') if part)

This has the benefit of allowing further processing on each part in-line.
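
As a small illustration of that in-line processing, here is a sketch that also trims whitespace around each part. The sample string `raw` is invented for the example and is not from the question.

```python
raw = ' str1 ; str2;;str3 ;'

# Strip whitespace from each part and keep only the parts that are non-empty afterwards.
parts = tuple(part.strip() for part in raw.split(';') if part.strip())

print(parts)  # ('str1', 'str2', 'str3')
```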

It's interesting to note that the documentation for str.split() says:

... If sep is not specified or is None, any whitespace string is a separator and empty strings are removed from the result.

I wonder why this special case was done, without allowing it for other separators...
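
The difference is easy to see side by side. In this sketch the sample string `text` is invented for the example:

```python
text = 'a  b '

# No separator given: runs of whitespace are one separator and empty strings are dropped.
default_split = text.split()

# Explicit separator: every occurrence splits, so empty strings are kept.
explicit_split = text.split(' ')

print(default_split)   # ['a', 'b']
print(explicit_split)  # ['a', '', 'b', '']
```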

voithos
1

Use split and then slicing:

my_str.split(';')[:-1]

or:

lis = [x for x in my_str.split(';') if x]
Ashwini Chaudhary
1

If the number of items in your string is fixed, you could also destructure inline. Unpacking needs an exact count, so remove the trailing separator first:

(str1, str2, str3) = my_str.rstrip(";").split(";")

More on that here: https://blog.teclado.com/destructuring-in-python/

Sonic Soul
0

I know this is an old question, but I just came upon it and saw that the top answer (David) doesn't return a tuple as the OP requested. The highest-voted answer (Levon) works for the one example the OP gave, but it slices off the last element unconditionally, so it would drop a real item if the string has no trailing semicolon.

The most robust and pythonic solution is voithos' answer:

tuple(part for part in my_str.split(';') if part) 

Here's my solution:

tuple(my_str.strip(';').split(';'))

It returns this when run against an empty string though:

('',)

So I'll be replacing mine with voithos' answer. Thanks voithos!
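
For anyone who wants to check the edge cases themselves, here is a quick side-by-side sketch. The function names are made up for the comparison:

```python
def by_filter(s):
    # The approach from the question.
    return tuple(filter(None, s.split(';')))

def by_genexp(s):
    # voithos' generator-expression version.
    return tuple(part for part in s.split(';') if part)

def by_strip(s):
    # Strip leading/trailing separators first, then split.
    return tuple(s.strip(';').split(';'))

for sample in ('str1;str2;str3;', '', 'a;;b;'):
    print(repr(sample), by_filter(sample), by_genexp(sample), by_strip(sample))
```

The first two agree on every sample. The strip version gives ('',) for the empty string and keeps the empty string in the middle of 'a;;b;'.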