16

I want to transform a string such as the following:

'   1   ,   2  ,    ,   ,   3   '

into a list of non-empty elements:

['1', '2', '3']

My solution is this list comprehension:

print([el.strip() for el in mystring.split(",") if el.strip()])

I just wonder: is there a nice, Pythonic way to write this comprehension without calling el.strip() twice?

peter.slizik
  • Where did the string come from? How was it created? – Chris_Rands Oct 31 '17 at 10:28
  • 1
    Duplicate of [1](https://stackoverflow.com/q/26672532/2301450) [2](https://stackoverflow.com/q/40539357/2301450) [3](https://stackoverflow.com/q/41112035/2301450) [4](https://stackoverflow.com/q/15812779/2301450). Just enter the title of this question into Google search. – vaultah Oct 31 '17 at 15:50

7 Answers

22

You can use a generator inside the list comprehension:

  [x for x in (el.strip() for el in mystring.split(",")) if x]
#             \___________________ ____________________/
#                                 v
#                         internal generator

The inner generator yields the stripped elements; the outer comprehension iterates over them and only checks their truthiness. Each element is therefore stripped exactly once instead of twice.

You can also use map for this, which makes it more functional in style:

  [x for x in map(str.strip, mystring.split(",")) if x]
#             \_______________ _________________/
#                              v
#                             map

But this is basically the same (although the logic of the generator is - in my opinion - better encapsulated).
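To make the saving concrete, here is a minimal sketch that counts how often strip is actually called. The helper names (strip_twice, strip_once, run_counted, counting_strip) are hypothetical and only serve the illustration; the sample string is the one from the question.

```python
def strip_twice(text, strip):
    # The comprehension from the question: strip in the filter and in the output.
    return [strip(el) for el in text.split(",") if strip(el)]


def strip_once(text, strip):
    # Nested generator: strip each piece once, then test truthiness.
    return [x for x in (strip(el) for el in text.split(",")) if x]


def run_counted(build, text):
    """Hypothetical helper: run build and return (result, number of strip calls)."""
    calls = 0

    def counting_strip(s):
        nonlocal calls
        calls += 1
        return s.strip()

    return build(text, counting_strip), calls


mystring = '   1   ,   2  ,    ,   ,   3   '
twice, calls_twice = run_counted(strip_twice, mystring)
once, calls_once = run_counted(strip_once, mystring)

print(twice, calls_twice)
print(once, calls_once)
```

Both versions return the same list; the difference is only in the number of strip calls (one per piece for the nested form, plus one extra per non-empty piece for the original).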

heemayl
Willem Van Onsem
8

As a simple alternative to get a list of non-empty elements (in addition to previous good answers):

import re

s = '   1   ,   2  ,    ,   ,   3   '
print(re.findall(r'[^\s,]+', s))

The output:

['1', '2', '3']
RomanPerekhrest
5

How about a regex that extracts all the numbers from the string?

import re

a = '   1   ,   2  ,    ,   ,   3   '
print(re.findall(r'\d+', a))

Output:

['1', '2', '3']
Miraj50
3

In just one line of code, that's about as terse as you're going to get. Of course, if you want to get fancy you can try the functional approach:

filter(lambda x: x, map(lambda x: x.strip(), mystring.split(',')))

But this trades readability for terseness.

omu_negru
  • the `map` i find a bit over the top. You can just put OP's comprehension in there without the `if` – Ma0 Oct 31 '17 at 10:32
  • 1
    i figured i'd go functional all the way and not 'spoil' it :) – omu_negru Oct 31 '17 at 10:33
  • 6
    `filter(None, map(str.strip, mystring.split(",")))` would be a lot less ugly if you insist on the functional approach – Chris_Rands Oct 31 '17 at 10:38
  • didn't know about the None part.... but the str.strip should have been obvious. thanks :) – omu_negru Oct 31 '17 at 10:39
  • I like this solution. However, I decided to accept Willem's answer, would like to stick with generators... – peter.slizik Oct 31 '17 at 11:12
  • his solution is more readable too... but AFAIK both map and filter use generators underneath – omu_negru Oct 31 '17 at 11:20
  • @omu_negru: In Python 3.x, `map` and `filter` indeed return lazy iterators (not generators in the strict sense, but they behave similarly). In Python 2.x, `filter` and `map` produce lists, which can unfortunately cause memory problems, since all items are held in memory at once. – Willem Van Onsem Oct 31 '17 at 12:29
2

Go full functional with map and filter by using:

s = '   1   ,   2  ,    ,   ,   3   '
res = filter(None, map(str.strip, s.split(',')))

Though similar to @omu_negru's answer, this avoids lambdas, which are arguably pretty ugly and also slow things down.

The argument None to filter means: filter on truthiness, essentially x for x in iterable if x. The map simply applies the method str.strip (which strips whitespace by default) to each item of the iterable obtained from s.split(',').
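As a small sketch of those two building blocks in isolation (the sample lists items and padded are made up for illustration), filter(None, ...) keeps exactly the truthy items, and str.strip with no argument removes any leading and trailing whitespace:

```python
items = ['1', '', '2', '', '3']  # made-up sample data

# filter(None, iterable) drops every falsy item, like "x for x in iterable if x".
via_filter = list(filter(None, items))
via_comprehension = [x for x in items if x]

# str.strip without arguments removes spaces, tabs and newlines at both ends.
padded = ['  1 ', '\t2\n', '   ']
stripped = list(map(str.strip, padded))

print(via_filter)
print(via_comprehension)
print(stripped)
```

Note that an all-whitespace string becomes the empty string after stripping, which is why the subsequent truthiness filter removes it.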

On Python 2, where filter still returns a list, this approach should easily edge out the other approaches in speed.


In Python 3 one would have to use:

res = [*filter(None, map(str.strip, s.split(',')))]

in order to get the list back.
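A short sketch of what this looks like in practice, assuming Python 3.5 or later (the [*...] unpacking form needs it; list(...) is an equivalent spelling that works on any Python 3):

```python
s = '   1   ,   2  ,    ,   ,   3   '

lazy = filter(None, map(str.strip, s.split(',')))

# In Python 3 this is a lazy filter object, not a list.
is_list = isinstance(lazy, list)
print(type(lazy).__name__)

# Consuming it once produces the elements ...
first_pass = list(lazy)
# ... and a second pass finds the iterator already exhausted.
second_pass = list(lazy)

# Two equivalent ways to materialise a fresh iterator as a list.
unpacked = [*filter(None, map(str.strip, s.split(',')))]
called = list(filter(None, map(str.strip, s.split(','))))

print(first_pass, second_pass, unpacked == called)
```

The exhaustion on the second pass is the main practical difference from Python 2, where the result is a list that can be reused.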

Dimitris Fasarakis Hilliard
-1

If you have imported "re", then re.split() will work:

import re
s='   1   ,   2  ,    ,   ,   3   '
print ([el for el in re.split(r"[, ]+",s) if el])
['1', '2', '3']

If strings separated by only spaces (with no intervening comma) should not be separated, then this will work:

import re
s=' ,,,,,     ,,,,  1   ,   2  ,    ,   ,   3,,,,,4   5, 6   '
print ([el for el in re.split(r"\s*,\s*",s.strip()) if el])
['1', '2', '3', '4   5', '6']
Prem
  • But this will also remove the space in between an element. For instance `' a b, qux foo, 1 2 3'`. Will result in `'a', 'b', 'qux', 'foo', '1', '2', '3'`. – Willem Van Onsem Oct 31 '17 at 14:50
  • @WillemVanOnsem , It is not clear, but OP wants only numbers, not alphanumeric strings. Check the answer by RomanPerekhrest which uses regex r'[^\s,]+' (6 upvotes with no negative comments, but it implicitly has the same problem you mention) ; Check the answer by Miraj50 which says "extract all the numbers from the string" (3 upvotes with no negative comments, but it explicitly has the same problem you mention) ; I know what you are saying and I have specifically included "4 5" in my answer. I wonder why my answer is so bad or wrong to get only downvotes .... – Prem Oct 31 '17 at 16:10
  • I did not downvote. I agree the same comment applies to RomanPerekhrest. Nevertheless usually one favors generic robust and flexible methods over methods that can only fix the exact problem: if later the OP changes his mind a bit, then this solution might stop working. – Willem Van Onsem Oct 31 '17 at 16:12
  • @WillemVanOnsem , thanks for the feedback. I have added a minor edit to my answer. I have upvoted your correct answer. [[ If OP later changes his mind a bit, then he will also have to change the code a bit, whether he uses your snippet or mine or something else ]] – Prem Oct 31 '17 at 16:38
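The difference discussed in these comments can be checked directly. This is only a sketch using the sample string from Willem Van Onsem's comment; the variable names are arbitrary.

```python
import re

s = ' a b, qux foo, 1 2 3'  # sample from the comment above

# Splitting on runs of commas OR spaces also breaks up multi-word elements.
split_on_spaces = [el for el in re.split(r"[, ]+", s) if el]

# Splitting only on commas (with optional surrounding whitespace) keeps them intact.
split_on_commas = [el for el in re.split(r"\s*,\s*", s.strip()) if el]

# The split/strip approach from the question behaves like the comma-only regex.
split_and_strip = [x for x in map(str.strip, s.split(",")) if x]

print(split_on_spaces)
print(split_on_commas)
print(split_and_strip)
```

Which behaviour is right depends on whether elements may contain inner spaces, which the question does not specify.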
-1

List comprehensions are wonderful, but it's not illegal to use more than one line of code! You could even - heaven forbid - use a for loop!

result = []
for el in mystring.split(","):
    x = el.strip()
    if x:
        result.append(x)

Here's a two-line version. It's actually the same as the accepted answer by Willem Van Onsem, but with a name given to a subexpression (and a generator changed to a list but it makes essentially no difference for a problem this small). In my view, this makes it a lot easier to read, despite taking fractionally more code.

all_terms = [el.strip() for el in mystring.split(",")]
non_empty_terms = [x for x in all_terms if x]

Some of the other answers are certainly shorter, but I'm not convinced any of them are simpler/easier to understand. Actually, I think the best answer is just the one in your question, because the repetition in this case is quite minor.

Arthur Tacca
  • 1
    I'd appreciate feedback on the downvotes. Was it the sarcasm (admittedly unnecessary) or are voters really that against splitting code into multiple statements for clarity? – Arthur Tacca Oct 31 '17 at 16:10