Is there a better way to use strip() on a list of strings? - python

Question

So far I've been trying to apply strip() to every string in a list, and I did this:

i = 0
for j in alist:
    alist[i] = j.strip()
    i+=1

Is there a better way of doing that?

Upvoting for random anonymous uncommented downvote. If there is something wrong with the question, it's utterly meaningless to downvote without telling the author what. — KRyan, Aug 29 '12 at 16:52
If you want to iterate using indices, do `for (i, value) in enumerate(alist)` — Kos, Aug 29 '12 at 17:11
I've added a benchmark which compares some options described here. — Kos, Aug 29 '12 at 17:26
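
As a minimal sketch of the enumerate suggestion in the comment above: the loop from the question can drop its manual counter. The sample strings are made up for illustration; `alist` is the name used in the question.

```python
# Hypothetical sample data; `alist` matches the name used in the question.
alist = ["  foo ", "bar\n", "\tbaz  "]

# enumerate yields (index, value) pairs, so no manual counter is needed.
for i, value in enumerate(alist):
    alist[i] = value.strip()

print(alist)
```

This still rewrites the list element by element, so every other reference to the same list sees the stripped values.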

score 40 · Accepted Answer · answered Aug 29 '12 at 16:50

You probably shouldn't be using list as a variable name since it's a type. Regardless:

list = map(str.strip, list)

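This applies the function str.strip to every element in list and stores the result back in list. In Python 2, map returns a new list; in Python 3 it returns a lazy iterator, so wrap the call in list(...) if you need an actual list.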
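
A short sketch of the same idea for Python 3, where the result of map has to be materialized explicitly. The variable names `lines` and `stripped` are assumptions chosen to avoid shadowing the built-in `list`; they are not from the answer.

```python
# Hypothetical sample data; the name `lines` avoids shadowing the built-in `list`.
lines = [" a ", "b\n", "  c"]

# In Python 3, map returns a lazy iterator, so list() materializes it.
stripped = list(map(str.strip, lines))

print(stripped)
```

The original `lines` list is left untouched here; only the new name refers to the stripped values.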

answered Aug 29 '12 at 16:50

eduffy

+1 that's the way. And if you want to alter the same list instance instead of binding the variable to a new one (say, not to break other references to this list), use the slice syntax like @kojiro said – Kos Aug 29 '12 at 16:55
An example where `map` is an excellent choice. (`itertools.imap` might or might not be better, of course, as for example when assigning to a slice). – Marcin Aug 29 '12 at 16:55
@Kos In that case, an iterator-based solution would be even better (as it avoids creating a whole list which is then unreferenced and awaiting garbage collection). – Marcin Aug 29 '12 at 16:57
no worries, memory shouldn't be a problem since i'm reading a file, searching a string and dumping it away once i've found the index of a string. =) – alvas Aug 29 '12 at 17:03
Instead of using map and storing the data in the list again, itertools.imap is better in case of python 2.x. In python 3.x map will return iter. – shantanoo Sep 30 '13 at 20:04

score 19 · Answer 2 · edited Aug 29 '12 at 17:41

You could use a list comprehension:

stripped_list = [j.strip() for j in initial_list]

edited Aug 29 '12 at 17:41

georg

answered Aug 29 '12 at 16:52

karthikr

Do you think list comprehensions make code work faster?? or just smaller?? – Surya Aug 29 '12 at 17:01
List comprehensions are very efficient for iterable object with simple rules. You may use maps and list comprehensions depending on the complexity. But yes, they do provide a quick and efficient implementation – karthikr Aug 29 '12 at 17:05

score 10 · Answer 3 · answered Aug 29 '12 at 17:25

Some intriguing discussions on performance happened here, so let me provide a benchmark:

http://ideone.com/ldId8

noslice_map              : 0.0814900398254
slice_map                : 0.084676027298
noslice_comprehension    : 0.0927240848541
slice_comprehension      : 0.124806165695
iter_manual              : 0.133514881134
iter_enumerate           : 0.142778873444
iter_range               : 0.160353899002

So:

- map(str.strip, my_list) is the fastest way; it's just a little bit faster than comprehensions.
  - Use map or itertools.imap if there's a single function that you want to apply (like str.split).
  - Use comprehensions if there's a more complicated expression.
- Manual iteration is the slowest way; a reasonable explanation is that the interpreter has to do more work and less of it is done by the efficient C runtime.
- Go ahead and assign the result like my_list[:] = map...; the slice notation introduces only a small overhead and is likely to spare you some bugs if there are multiple references to that list.
  - Know the difference between mutating a list and re-creating it.
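
To illustrate the last point, here is a small sketch of the difference between rebinding a name and mutating the list through slice assignment. All variable names and sample strings are made up for the example.

```python
original = [" x ", " y "]
alias = original              # a second reference to the same list object

# Rebinding: `original` now names a new list; `alias` still sees the old one.
original = list(map(str.strip, original))

# Slice assignment: the existing list object is changed in place,
# so every reference to it sees the stripped strings.
shared = [" x ", " y "]
other_ref = shared
shared[:] = map(str.strip, shared)

print(alias)      # still the unstripped strings
print(other_ref)  # the stripped strings
```

If other code holds a reference to the list and should see the change, the slice form is the one that achieves this.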

Do you mean `my_list = map(str.strip, list[:])`? 'Cause the other way gives me a NameError. — Izkata, Aug 29 '12 at 18:17
I mean `my_list[:] = map(str.strip, my_list)`. See the code under the link. — Kos, Aug 30 '12 at 05:23

kojiro · Answer 4 · 2012-08-29T18:56:09.887

3

I think you mean

a_list = [s.strip() for s in a_list]

Using a generator expression may be a better approach, like this:

stripped_list = (s.strip() for s in a_list)

This offers the benefit of lazy evaluation: each element is stripped only when it is actually requested.
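
A rough sketch that makes the laziness visible. `strip_logged` and `calls` are hypothetical helpers added only to record when stripping happens; they are not part of the answer.

```python
calls = []

def strip_logged(s):
    # Hypothetical helper: records each call so laziness is observable.
    calls.append(s)
    return s.strip()

a_list = [" one ", " two ", " three "]
stripped_gen = (strip_logged(s) for s in a_list)

# Creating the generator has not stripped anything yet.
calls_before = len(calls)

first = next(stripped_gen)      # strips only the first element
calls_after_first = len(calls)

rest = list(stripped_gen)       # consumes the remaining elements
```

Note that a generator can be consumed only once; iterating over `stripped_gen` again after this yields nothing.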

If you need references to the list to remain intact outside the current scope, you might want to use list slice syntax:

a_list[:] = [s.strip() for s in a_list]

For commenters interested in the speed of various approaches, it looks as if in CPython the generator-to-slice approach is the least efficient:

>>> from timeit import timeit as t
>>> t("""a[:]=(s.strip() for s in a)""", """a=[" %d " % s for s in range(10)]""")
4.35184121131897
>>> t("""a[:]=[s.strip() for s in a]""", """a=[" %d " % s for s in range(10)]""")
2.9129951000213623
>>> t("""a=[s.strip() for s in a]""", """a=[" %d " % s for s in range(10)]""")
2.47947096824646
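
For anyone who wants to rerun the comparison as a script rather than at the prompt, here is a sketch using the same statements and setup as the session above. The dictionary labels and the reduced `number` are my own choices, and timings will vary by machine and Python version, so no output is shown.

```python
from timeit import timeit

# Same setup as the session above: ten short padded strings.
setup = 'a = [" %d " % s for s in range(10)]'

# Labels are illustrative names, not from the original answer.
statements = {
    "generator_to_slice": "a[:] = (s.strip() for s in a)",
    "list_to_slice": "a[:] = [s.strip() for s in a]",
    "rebind": "a = [s.strip() for s in a]",
}

# number is reduced from timeit's default of 1,000,000 to keep the run short.
results = {
    label: timeit(stmt, setup, number=100_000)
    for label, stmt in statements.items()
}

for label, seconds in results.items():
    print(f"{label:20s}: {seconds:.4f}")
```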

edited Aug 29 '12 at 18:56

answered Aug 29 '12 at 16:51

Why say "supposedly slightly more efficient" instead of profiling and checking? And BTW `[:]` is useful because then it alters the same list, not re-assigns the variable to a new list. – Kos Aug 29 '12 at 16:54
It's *less* efficient because it has to copy N items instead of replacing the reference to the list. The only "advantage", which you may not need or want, is that the change is visible to anyone who has another reference to the original list object. – Aug 29 '12 at 16:54
imho, that's unpythonic. – Sean W. Aug 29 '12 at 16:57
I've changed this to a generator expression, as it's vastly more appropriate. – Marcin Aug 29 '12 at 16:58
@Marcin it might be a more appropriate *approach*, but it's an incorrect answer to the question asked. I edited the question to describe both options. – kojiro Aug 29 '12 at 17:02
@kojiro If you are assigning to a slice, a generator is more appropriate. You have edited your question to eliminate slice assignment. – Marcin Aug 29 '12 at 18:45
@Marcin how is it more appropriate? I've added timeits and it doesn't seem to be as efficient. (I'm not equating efficiency and appropriateness, but I genuinely don't know why it would be more appropriate in the absence of efficiency.) – kojiro Aug 29 '12 at 18:57
@kojiro You'll likely see better efficiency for larger lists, as less memory allocation will occur; secondly in real usage it is likely to lead to better overall performance, as there will be less in the way of garbage collectible, but uncollected objects hanging around. – Marcin Aug 29 '12 at 19:01
Also, it's generally nicer for everybody else if you don't copy the interpreter prompts – Marcin Aug 29 '12 at 19:03
@Marcin if I don't copy the interpreter prompts, how can you tell the difference between a command and its output? – kojiro Aug 29 '12 at 19:22
@kojiro I usually comment the output like so `# => `, but a simple comment will suffice. – Marcin Aug 29 '12 at 19:31
