255

I know that it is not safe to modify a list while iterating over it. However, suppose I have a list of strings, and I want to strip the strings themselves. Does replacement of mutable values count as modification?


See Scope of python variable in for loop for a related problem: assigning to the iteration variable does not modify the underlying sequence, and also does not impact future iteration.
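
To make the note above concrete, here is a minimal sketch (the list name `words` is just an illustration, not from the original post) showing that assigning to the loop variable only rebinds that name and leaves the list untouched:

```python
words = [' a ', ' b ']

for w in words:
    w = w.strip()  # rebinds the local name w; the list still holds the old strings

print(words)  # -> [' a ', ' b ']
```

To actually replace the elements, the assignment has to target the list itself, for example through an index.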

Karl Knechtel
alexgolec

10 Answers

241

Since the loop below only modifies elements already seen, it would be considered acceptable:

a = ['a',' b', 'c ', ' d ']

for i, s in enumerate(a):
    a[i] = s.strip()

print(a) # -> ['a', 'b', 'c', 'd']

Which is different from:

a[:] = [s.strip() for s in a]

in that it doesn't require the creation of a temporary list and an assignment of it to replace the original, although it does require more indexing operations.

Caution: Although you can modify entries this way, you can't change the number of items in the list without risking problems.

Here's an example of what I mean: deleting an entry messes up the indexing from that point on:

b = ['a', ' b', 'c ', ' d ']

for i, s in enumerate(b):
    if s.strip() != b[i]:  # leading or trailing whitespace?
        del b[i]

print(b)  # -> ['a', 'c ']  # WRONG!

(The result is wrong because it didn't delete all the items it should have.)
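
One way to see why items get skipped is to record what the loop actually visits. The sketch below is the same loop with a hypothetical `visited` list added purely for tracing:

```python
b = ['a', ' b', 'c ', ' d ']
visited = []  # hypothetical helper, only used to trace the iteration

for i, s in enumerate(b):
    visited.append((i, s))
    if s.strip() != s:  # leading or trailing whitespace?
        del b[i]        # everything after index i shifts one place left

print(visited)  # -> [(0, 'a'), (1, ' b'), (2, ' d ')]
print(b)        # -> ['a', 'c ']
```

After `' b'` is deleted, `'c '` slides into index 1, which the loop has already passed, so it is never examined.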

Update

Since this is a fairly popular answer, here's how to effectively delete entries "in-place" (even though that's not exactly the question):

b = ['a',' b', 'c ', ' d ']

b[:] = [entry for entry in b if entry.strip() == entry]

print(b)  # -> ['a']  # CORRECT

See How to remove items from a list while iterating?.
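
Another commonly used approach, not part of the original answer, is to walk the indices from the end, so that a deletion only shifts items that have already been checked. A minimal sketch:

```python
b = ['a', ' b', 'c ', ' d ']

# Iterate from the last index down to 0; deleting b[i] never
# changes the position of any element that is still to be visited.
for i in range(len(b) - 1, -1, -1):
    if b[i].strip() != b[i]:
        del b[i]

print(b)  # -> ['a']
```

Each `del` is still linear in the list length, so for large lists the slice-assignment version above is likely the better choice.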

martineau
  • 4
    Why does Python only make a copy of the individual element in the syntax `for i in a` though? This is very counterintuitive, seemingly different from other languages and has resulted in errors in my code that I had to debug for a long period of time. Python Tutorial doesn't even mention it. Though there must be some reason to it? – xji Jan 29 '17 at 17:25
  • 3
    @JIXiang: It doesn't make copies. It just assigns the loop variable name to successive elements or values of the thing being iterated over. – martineau Jan 29 '17 at 19:39
  • 1
    Eww, why use two names (`a[i]` and `s`) for the same object in the same line when you don't have to? I'd much rather do `a[i] = a[i].strip()`. – Navin Sep 05 '17 at 18:25
  • 6
    @Navin: Because `a[i] = s.strip()` only does one indexing operation. – martineau Sep 05 '17 at 19:03
  • 1
    @martineau `enumerate(b)` does an indexing operation on every iteration and you're doing another one with `a[i] =`. AFAIK it is impossible to implement this loop in Python with only 1 indexing operation per loop iteration :( – Navin Sep 05 '17 at 19:37
  • 2
    @Navin: Using `enumerate()` doesn't add an indexing operation. However, regardless of whether it does or not, the total number of them performed per iteration is obviously less via `a[i] = s.strip()` than `a[i] = a[i].strip()`. – martineau Nov 16 '18 at 11:49
  • "deleting an entry messes-up the indexing from that point on" - what is this concept called? I want to read more about it. Is the list checked at every iteration or just loaded once at beginning of loop? – variable Nov 08 '19 at 17:41
  • 2
    @variable: Don't know of a specific name for the concept. The problem's related to how lists are stored and iterated over internally (which isn't documented and might vary in different versions). It seems very logical to me that the operation could get "messed-up" — i.e. not be done correctly — if the thing that's being iterated is changed while iterating over it. It also depends on what the modification is as well as what type of elements are in the list. See [Modify a list while iterating](https://stackoverflow.com/questions/44864393/modify-a-list-while-iterating) for more information. – martineau Nov 08 '19 at 18:36
  • Thanks for taking the time to respond. I particularly liked this example, which helped me get the concept right. https://stackoverflow.com/questions/13939341/why-does-a-for-loop-with-pop-method-or-del-statement-not-iterate-over-all-list – variable Nov 09 '19 at 03:31
  • @variable: Yes, that question has some good answers. However, the explanations are all likely assuming the list position "cursor" is implemented in a certain way (to illustrate the point). It could, in theory, be changed someday to a way that avoids the issue, though I doubt that'll ever happen. – martineau Nov 09 '19 at 03:43
  • 1
    Note: counter-intuitively even this generator expression variant: `a[:] = (s.strip() for s in a)` first creates a temporary list and only then it assigns its elements to the original list. :( – pabouk - Ukraine stay strong Feb 12 '21 at 14:04
  • @pabouk: What is the basis of your claim? Regardless, that wasn't the subject of this question. – martineau Feb 19 '21 at 00:27
  • 1
    @martineau It is explained for example here: https://stackoverflow.com/a/53286694/320437 and here is the part of CPython 3.9 implementation: https://github.com/python/cpython/blob/2c0a0b04a42dc4965fcfaef936f497e44f06dea5/Objects/listobject.c#L630 --- Another explanation: https://stackoverflow.com/a/11877248/320437 --- I think the reason for this behaviour is that taking care of all possible bad cases for in-place list modification using an arbitrary generator expression is (almost) impossible. --- IMHO the question is about a list in-place modification so I think it concerns the subject. – pabouk - Ukraine stay strong Mar 07 '21 at 15:54
  • @pabouk: IMO doing `a[:] = ` isn't modifying the list in-place, it's replacing it "in-place" which isn't the same thing. – martineau Mar 07 '21 at 17:42
  • 1
    @martineau `a[:] = iterable` is a special case of `a[start:stop:step] = iterable` which is not a list replacement in general. It could be a point of view if you call `a[:] = ` a replacement or modification. I feel modification is a more precise description of the operation. In all the cases the object `a` stays (check `id(a)`), only its sliced elements are replaced/removed/extended. – pabouk - Ukraine stay strong Mar 08 '21 at 11:26
  • I agree 100% with @Navin in the sense that it's weird that the language supports the syntax on the right side but not the left side. Python clearly knows what `s` is in `a[i] = s.strip()`, so why doesn't it just support `s = s.strip()`? – Mike B Aug 05 '21 at 01:06
  • 1
    @mblakesley: Because `s` is merely a temporary *name* referring to one of the list's string elements, and because strings are immutable, calling its `strip()` method returns a modified [**copy**](https://docs.python.org/3/library/stdtypes.html#str.strip) of its current value. The significance of which is that what a `s = s.strip()` statement does is assign the name `s` to this new value, all of which has no effect on the list element itself. I suggest you read [Facts and myths about Python names and values](https://nedbatchelder.com/text/names.html). – martineau Aug 05 '21 at 01:52
  • @martineau "all of which has no affect on the list element itself". But that's not true! For example, if we're talking about a list of dicts, you can modify them in place: `for d in d_list: d["x"] = "modified"`. So you can modify them, but you can't replace them, like `for d in d_list: d = {"x": "modified"}`. Why doesn't the 2nd example change the underlying list? – Mike B Aug 06 '21 at 01:39
  • @mblakesley: The difference is because dictionaries are mutable and strings are not. `d["x"] = "modified"` changes the dictionary `d` refers to in-place, but `s = s.strip()` does not change the string `s` refers to because it can't. Your 2nd example doesn't (and never will) work because all it's doing is assigning a completely different value to the name `d` (so the mutability of its current value doesn't matter). If you still don't understand the distinction or why it matters, post a question on the topic because I won't be discussing it further here. – martineau Aug 06 '21 at 08:56
  • @martineau Ah, I see now. `s` is a new variable that's set to `a[i]`. That may sound obvious, but personally, I expected `s` to actually *be* `a[i]` (thru interpreter magic), rather than being its own thing that's merely referring to `a[i]`. Makes sense. Thanks! – Mike B Aug 09 '21 at 17:59
  • @mblakesley: Although I indicated I didn't want to discuss this further, your last comment simply compels me to say one last thing: **You still don't get it.** – martineau Aug 09 '21 at 18:43
175

It's considered poor form. Use a list comprehension instead, with slice assignment if you need to retain existing references to the list.

a = [1, 3, 5]
b = a
a[:] = [x + 2 for x in a]
print(b)  # -> [3, 5, 7]
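
To illustrate why the slice assignment matters, here is a small sketch contrasting plain assignment with slice assignment. The names `alias`, `a2`, and `alias2` are made up for this comparison:

```python
# Plain assignment rebinds the name; other references keep the old list.
a = [1, 3, 5]
alias = a
a = [x + 2 for x in a]
print(alias)        # -> [1, 3, 5]
print(alias is a)   # -> False

# Slice assignment replaces the contents of the existing list object.
a2 = [1, 3, 5]
alias2 = a2
a2[:] = [x + 2 for x in a2]
print(alias2)        # -> [3, 5, 7]
print(alias2 is a2)  # -> True
```
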
Jemshit
Ignacio Vazquez-Abrams
  • 18
    The slice assignment is clever and avoids modifying the original during the loop, but requires the creation of a temporary list the length of the original. – martineau Nov 02 '10 at 22:45
  • 11
    @Vigrond: So when the `print b` statement is executed, you can tell if `a` was modified in-place rather than replaced. Another possibility would have been a `print b is a` to see if they both still refer to the same object. – martineau Jun 18 '13 at 21:27
  • 1
    lovely solution and python-like! – loretoparisi Apr 05 '16 at 15:23
  • 17
    why a[:] = and not just a = ? – kdubs Mar 16 '17 at 11:40
  • 15
    @kdubs: "...with slice assignment if you need to retain existing references to the list." – Ignacio Vazquez-Abrams Mar 16 '17 at 12:12
  • @IgnacioVazquez-Abrams What if we require more than one line of code to express the loop? – cristiprg Dec 07 '19 at 21:46
  • @cristiprg lambda expr. – LIU Qingyuan Oct 16 '20 at 03:01
  • If you want to modify the original list but don't want to create a temporary, you can assign from a generator expression rather than a list comprehension (using parentheses rather than square brackets, like `a[:] = (x + 2 for x in a)`). This is generally a bit slower, though. It also means that you're *effectively* doing the same thing as the "poor form" `for` loop. – Karl Knechtel Aug 11 '22 at 06:25
24

One more for loop variant, which looks cleaner to me than the one with enumerate():

for idx in range(len(my_list)):
    my_list[idx] = ...  # set a new value
    # some other code which doesn't let you use a list comprehension
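
For comparison, a loop body with several statements can also be written with `enumerate()`, which hands over the index and the current value together. This is only a sketch: the `readings` data and the parsing rule are invented for illustration.

```python
readings = [' 12', 'n/a', '7 ', '']  # hypothetical input data

for idx, raw in enumerate(readings):
    text = raw.strip()
    if text.isdigit():
        readings[idx] = int(text)  # replace the element at the current index
    else:
        readings[idx] = None       # mark values that cannot be parsed

print(readings)  # -> [12, None, 7, None]
```

Both variants only replace elements at existing indices, so the length of the list never changes during the loop.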
Kenly
Eugene Shatsky
  • 33
    Many consider using something like `range(len(list))` in Python a code smell. – martineau Aug 03 '14 at 13:55
  • 1
    @martineau disagree, `range` is cleaner than [`enumerate`](https://docs.python.org/2/library/functions.html#enumerate), as in the background `enumerate` is a generator which creates a list of tuples (index, value), which could be considered slower than just iterating through the list. – Reishin May 05 '15 at 03:42
  • 3
    @Reishin: Since `enumerate` is a generator, it's not creating a list of tuples; it creates them one at a time as it iterates through the list. The only way to tell which is slower would be to `timeit`. – martineau May 05 '15 at 06:35
  • 4
    @martineau the [code](http://pastebin.com/p42m0PRK) may not be very pretty, but according to `timeit`, `enumerate` is slower – Reishin May 05 '15 at 07:01
  • 2
    @Reishin: Your benchmarking code isn't completely valid because it doesn't take into account the need to retrieve the value in the list at the given index - which isn't shown in this answer either. – martineau Jul 18 '15 at 20:21
  • @martineau What? What if I don't need to change the value? We are comparing looping and iterating here, not the set operation. – Reishin Jul 19 '15 at 13:54
  • 4
    @Reishin: Your comparison is invalid precisely for that reason. It's measuring the looping overhead in isolation. To be conclusive the time it takes the entire loop to execute must be measured because of the possibility that any overhead differences might be mitigated by the benefits provided to the code inside the loop of looping a certain way — otherwise you're not comparing apples to apples. – martineau Jul 19 '15 at 16:06
  • @martineau I wouldn't evaluate the code on speed, I'd do it on clarity. That depends more on the context, and in this case I think `range` might have an edge. – Mark Ransom Jun 22 '22 at 22:09
  • @Reishin: Python **reuses** small tuple objects. `enumerate()` puts two elements into a tuple that already exists, and the `for i, s in ...` loop then unpacks that tuple to two local variables, freeing the tuple to be reused in the next iteration. It's all very efficient and fast, plus you already have the referenced value, where you need a separate indexing statement when using `range()`. `enumerate()` is absolutely the more pythonic way of doing this, as well as the more efficient method. – Martijn Pieters Aug 19 '22 at 10:37
  • @MartijnPieters wrong, `enumerate` is just a generator which creates an index on the fly and yields a tuple back (from the C code). The thing is that the value returned from enumerate is a copied element (as Python does it by value in this case), so even if you don't need the value (even a big one), you will get a copy of it in memory. With range you can control this behavior. – Reishin Aug 19 '22 at 21:16
  • @Reishin no copies are made. Python objects are **always** passed by reference. If the goal is to have access to both the value and the index, `enumerate()` is hands-down the most efficient. I’m not sure where you have the idea from that the value is being copied, by the way. – Martijn Pieters Aug 19 '22 at 23:36
17

Modifying each element while iterating over a list is fine, as long as you do not add or remove elements.

You can use list comprehension:

l = ['a', ' list', 'of ', ' string ']
l = [item.strip() for item in l]

or use an index-based for loop with enumerate():

for index, item in enumerate(l):
    l[index] = item.strip()
cizixs
6

The answer given by Ignacio Vazquez-Abrams is really good. It can be further illustrated by this example. Imagine that:

  1. A list with two vectors is given to you.
  2. You would like to traverse the list and reverse the order of each one of the arrays.

Let's say you have:

import numpy as np

v = np.array([1,2,3,4])
b = np.array([3,4,6])

for i in [v, b]:
    i = i[::-1]   # Only rebinds the name i; the arrays are not reversed.

print([v,b])

You will get:

[array([1, 2, 3, 4]), array([3, 4, 6])]

On the other hand, if you do:

import numpy as np

v = np.array([1,2,3,4])
b = np.array([3,4,6])

for i in [v, b]:
    i[:] = i[::-1]   # Slice assignment writes into the array, reversing it in place.

print([v,b])

The result is:

[array([4, 3, 2, 1]), array([6, 4, 3])]
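
The same distinction shows up with plain Python lists, without NumPy. The sketch below uses illustrative names (`v`, `w`, `item`) and contrasts rebinding the loop variable with mutating the object it refers to:

```python
v = [1, 2, 3, 4]
w = [3, 4, 6]

for item in [v, w]:
    item = item[::-1]  # builds a new reversed list and rebinds item only

print([v, w])  # -> [[1, 2, 3, 4], [3, 4, 6]]

for item in [v, w]:
    item.reverse()     # mutates the list object that v (or w) also refers to

print([v, w])  # -> [[4, 3, 2, 1], [6, 4, 3]]
```
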
martineau
4

No, you wouldn't alter the "content" of the list if you could mutate strings that way. But in Python strings are not mutable: any string operation returns a new string.

If you had a list of objects you knew were mutable, you could do this as long as you don't change the actual contents of the list.

Thus you will need to do a map of some sort. If you use a generator expression it [the operation] will be done as you iterate and you will save memory.
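
As a rough sketch of that idea (the names `lazy` and `first` are only for illustration), a generator expression produces the stripped strings one at a time and leaves the list alone until the results are assigned back:

```python
a = ['a', ' b', 'c ', ' d ']

lazy = (s.strip() for s in a)  # nothing is computed yet
first = next(lazy)             # strips only the first element
print(first)  # -> a
print(a)      # -> ['a', ' b', 'c ', ' d ']  (the list itself is unchanged)

# To store the results in the same list object, assign them back:
a[:] = map(str.strip, a)
print(a)      # -> ['a', 'b', 'c', 'd']
```

The memory saving applies when the results are consumed one by one; assigning them back into the list still has to hold all of the new strings.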

Skurmedel
4

You can do something like this:

a = [1,2,3,4,5]
b = [i**2 for i in a]

This is called a list comprehension. It builds a new list by looping over an existing one in a single expression.

ktdrv
Nenoj
1

It is not clear from your question what the criteria for deciding which strings to remove are, but if you have or can make a list of the strings that you want to remove, you could do the following:

my_strings = ['a','b','c','d','e']
undesirable_strings = ['b','d']
for undesirable_string in undesirable_strings:
    for i in range(my_strings.count(undesirable_string)):
        my_strings.remove(undesirable_string)

which changes my_strings to ['a', 'c', 'e']
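
An alternative sketch, assuming the strings to remove can be kept in a set, filters the list in a single pass instead of calling `remove()` repeatedly. The set name `undesirable` and the extra duplicate `'b'` are illustrative:

```python
my_strings = ['a', 'b', 'c', 'd', 'e', 'b']
undesirable = {'b', 'd'}  # a set makes each membership test cheap

# Slice assignment keeps the same list object while dropping unwanted items.
my_strings[:] = [s for s in my_strings if s not in undesirable]

print(my_strings)  # -> ['a', 'c', 'e']
```
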

Jorge
1

In short, to modify a list while iterating over that same list, use slice assignment with a list comprehension:

my_list[:] = ["new value" for each_element in my_list "condition check"]

Example (keeps every element except "data1" and "data2"):

my_list[:] = [each_element for each_element in my_list if each_element not in ["data1", "data2"]]
siva balan
-1

Something I just discovered - when looping over a list of mutable types (such as dictionaries) you can just use a normal for loop like this:

l = [{"n": 1}, {"n": 2}]
for d in l:
    d["n"] += 1
print(l)
# prints [{'n': 2}, {'n': 3}]
TheEagle