407

I have to search through a list and replace all occurrences of one element with another. So far my attempts in code are getting me nowhere; what is the best way to do this?

For example, suppose my list has the following integers:

a = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1]

and I need to replace all occurrences of the number 1 with the value 10, so the output I need is:

a = [10, 2, 3, 4, 5, 10, 2, 3, 4, 5, 10]

Thus my goal is to replace all instances of the number 1 with the number 10.

cottontail
James

11 Answers

692

Try using a list comprehension and a conditional expression.

>>> a=[1,2,3,1,3,2,1,1]
>>> [4 if x==1 else x for x in a]
[4, 2, 3, 4, 3, 2, 4, 4]
Ry-
outis
  • 10
    But this doesn't change `a` though right? I think OP wanted `a` to change – Dula Feb 03 '16 at 23:26
  • 18
    @Dula you can do `a = [4 if x==1 else x for x in a]`, this will affect `a` – Alekhya Vemavarapu Apr 11 '16 at 07:57
  • @Dula: the question is vague as to whether `a` should mutate, but (as Alekhya shows) it's trivial to handle either case when using a list comprehension. – outis Aug 15 '16 at 11:55
  • 66
    If you want to mutate `a` then you should do `a[:] = [4 if x==1 else x for x in a]` (note the full list slice). Just doing the `a =` will create a new list `a` with a different `id()` (identity) from the original one; see the short demo after these comments – Chris_Rands Apr 25 '17 at 12:15
  • 1
    Just for evaluation purposes, note that this solution is by far the most consistent on time among the fast solutions (it doesn't matter if the item to replace is common or rare, runtime stays effectively constant). When the `list` is mostly items that stay unchanged, this is slower than optimized in-place solutions like [kxr's answer](https://stackoverflow.com/a/59478892/364696). kxr's answer, for len 1000 inputs, takes anywhere from ⅓ the time of this solution (when there are no items that need to be replaced) to 3x as long (when all items must be replaced); much more variable. – ShadowRanger Feb 25 '21 at 15:55
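Following up on the identity point raised in the comments above, here is a short demo (an editorial sketch, not part of the original answer) of rebinding versus slice assignment:

>>> a = [1, 2, 3, 1]
>>> original_id = id(a)
>>> a = [4 if x == 1 else x for x in a]     # rebinding: builds a new list object
>>> id(a) == original_id
False
>>> a = [1, 2, 3, 1]
>>> original_id = id(a)
>>> a[:] = [4 if x == 1 else x for x in a]  # slice assignment: same object, contents replaced
>>> id(a) == original_id
True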
315

You can use the built-in enumerate to get both index and value while iterating the list. Then, use the value to test for a condition and the index to replace that value in the original list:

>>> a = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1]
>>> for i, n in enumerate(a):
...   if n == 1:
...      a[i] = 10
...
>>> a
[10, 2, 3, 4, 5, 10, 2, 3, 4, 5, 10]
Tomerikoo
ghostdog74
  • 42
    This is a bad and very un-pythonic solution. Consider using list comprehension. – AdHominem Dec 31 '16 at 11:56
  • 308
    This is a fine if very un-pythonic solution. Consider using list comprehension. – Jean-François Corbett Feb 02 '17 at 13:35
  • Consider using list comprehension, such as is done by @outis below! – amc Feb 23 '19 at 00:02
  • 8
    This performs better than list comprehension though doesn't it? It does in-place updates instead of generating a new list. – neverendingqs May 13 '19 at 18:59
  • 2
    @neverendingqs: No. Interpreter overhead dominates the operation, and the comprehension has less of it. The comprehension performs slightly better, especially with a higher proportion of elements passing the replacement condition. Have some timings: https://ideone.com/ZrCy6z – user2357112 Mar 31 '20 at 23:26
  • 3
    This is really slow in comparison to using native list methods like `.index(10)`. There is no reason to visit every list element to find the elements that need to be replaced. Please see the timing in my answer here. – dawg May 10 '20 at 19:09
  • I prefer to use Pythonic ways to do things in Python because of both performance and readability improvements. Sometimes, while coding, we reach a case where we need to choose between a more readable solution and a more performant solution (a readability X performance trade-off), but sometimes there is just an optimal solution designed for a specific language to handle a specific issue, which will be both readable and performant. List comprehension is an example of this. – Victor Sep 13 '20 at 17:55
  • 1
    @dawg: The naïve `.index(10)` method either: 1) Only replaces once, thereby not meeting the OP's requirements, or 2) If done in a loop, is `O(n²)` work (if the `list` is `a = [1] * 1000`, it must perform 1000 `index` calls, each of which performs an average of `n/2` comparisons [`O(n)` work], to find the matching index). [kxr's answer](https://stackoverflow.com/a/59478892/364696) has the non-naïve version of the `.index`-based solution that's `O(n)` and will usually win over this answer (unless the list in question is more than a third or so items to replace). – ShadowRanger Feb 25 '21 at 15:46
  • Isn't this good in a **memory-efficient** way? I guess even the *in-place* solution using list comprehension `a[:]=[10 if e==1 else e for e in a]` requires more memory (the size of `a`). – starriet Apr 10 '22 at 03:58
85

If you have several values to replace, you can also use a dictionary:

a = [1, 2, 3, 4, 1, 5, 3, 2, 6, 1, 1]
replacements = {1:10, 2:20, 3:'foo'}
replacer = replacements.get  # For faster gets.

print([replacer(n, n) for n in a])

> [10, 20, 'foo', 4, 10, 5, 'foo', 20, 6, 10, 10]

Note that this approach works only if the elements to be replaced are hashable. This is because dict keys are required to be hashable.
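If some elements might be unhashable (see the comment about `[1, {'boom!'}, 3]` below), one possible workaround, sketched here as an editorial addition rather than part of the original answer, is to fall back to the original value when the lookup raises TypeError:

def replace_or_keep(value, replacements):
    """Look up a replacement, keeping unhashable values unchanged."""
    try:
        return replacements.get(value, value)
    except TypeError:  # unhashable value such as a list or set
        return value

a = [1, {'boom!'}, 3]
replacements = {1: 10, 3: 'foo'}
print([replace_or_keep(x, replacements) for x in a])
# [10, {'boom!'}, 'foo']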

Asclepius
roipoussiere
  • 1
    @jrjc @roipoussiere for in-place replacements, the `try-except` is at least 50% faster! Take a look at this [answer](http://stackoverflow.com/a/24203748/307454) – lifebalance Nov 23 '16 at 03:25
  • Thanks! `try-except` is faster but it breaks the loop at the first occurrence of an unknown item, and `dic.get(n,n)` is beautiful but slower than `if n in dic`. I edited my answer. – roipoussiere Jan 02 '17 at 14:26
  • `dic.get(n,n)` is very slightly slower than `dic[n] if n in dic else n` but much more readable (IMHO) so I would suggest using it for anything unless you are really trying to optimize some loop. – Iftah Dec 12 '17 at 11:44
  • 1
    This will fail for unhashable elements. It is a problem of all naive dict based substitutions. (just try with `[1, {'boom!'}, 3]`) – VPfB Jul 09 '19 at 09:29
  • 1
    @Iftah: If you're super-concerned about performance, pre-binding the `get` method outside the listcomp will dramatically reduce runtime for large inputs. Since you don't actually need a reference to the `dict` itself, you could just change it to `dget = {1:10, 2:20, 3:'foo'}.get`, and the listcomp to `[dget(n, n) for n in a]`. Even in CPython 3.9, which significantly optimized method calls (it no longer needs to create a bound method object in simple cases), this still reduces overhead for a len 1000 input by ~30% (by replacing `LOAD_METHOD`/`CALL_METHOD` with just `CALL_FUNCTION`). – ShadowRanger Feb 25 '21 at 16:07
  • 1
    The pre-bound `get` optimization brings this to the point of being comparable with the `dic[n] if n in dic else n` approach (takes 20-30% longer in most of my test cases, vs. 60-100% longer when you have to look up `dic.get` on every loop). – ShadowRanger Feb 25 '21 at 16:12
  • This is a beautiful and very clever solution, thanks! – MarMat May 10 '22 at 00:42
42

List comprehension works well, and looping through with enumerate can save you some memory (b/c the operation's essentially being done in place).

There's also functional programming. See usage of map:

>>> a = [1,2,3,2,3,4,3,5,6,6,5,4,5,4,3,4,3,2,1]
>>> list(map(lambda x: x if x != 4 else 'sss', a))
[1, 2, 3, 2, 3, 'sss', 3, 5, 6, 6, 5, 'sss', 5, 'sss', 3, 'sss', 3, 2, 1]
wjandrea
damzam
  • 20
    +1. It's too bad `lambda` and `map` are considered unpythonic. – outis Apr 07 '10 at 00:02
  • 6
    I'm not sure that lambda or map is inherently unpythonic, but I'd agree that a list comprehension is cleaner and more readable than using the two of them in conjunction. – damzam Apr 07 '10 at 02:14
  • 7
    I don't consider them unpythonic myself, but many do, including Guido van Rossum (http://www.artima.com/weblogs/viewpost.jsp?thread=98196). It's one of those sectarian things. – outis Apr 08 '10 at 01:29
  • @outis: `map`+`lambda` is less readable *and* slower than the equivalent listcomp. You can squeeze some performance out of `map` when the mapping function is a built-in implemented in C and the input is large enough for `map`'s per-item benefits to overcome the slightly higher fixed overhead, but when `map` needs a Python level function (e.g. a `lambda`) an equivalent genexpr/listcomp could inline (avoiding function call overhead), `map` really provides no benefit at all (as of 3.9, for a simple test case over `a = [*range(10)] * 100`, this `map` takes 2x as long as the equivalent listcomp). – ShadowRanger Feb 25 '21 at 16:19
  • Personally, I reserve my ire largely for `lambda`; I like `map` when I already have a function that does what I need laying around (the function is probably complicated enough to not be worth inlining in the listcomp anyway, or it's a built-in you can't inline anyway, e.g. `for line in map(str.rstrip, fileob):` to get the lines from a file one-by-one prestripped), but if I don't have such a function, I'd have to use a `lambda`, which ends up uglier and slower, as previously noted, so I may as well use the listcomp/genexpr. – ShadowRanger Feb 25 '21 at 16:21
12

On long lists with rare occurrences, it's about 3x faster to use list.index(), compared to the single-step iteration methods presented in the other answers.

def list_replace(lst, old=1, new=10):
    """replace list elements (inplace)"""
    i = -1
    try:
        while True:
            i = lst.index(old, i + 1)
            lst[i] = new
    except ValueError:
        pass
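For example (a usage sketch added editorially, not part of the original answer), applied to the question's list:

>>> a = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1]
>>> list_replace(a, old=1, new=10)
>>> a
[10, 2, 3, 4, 5, 10, 2, 3, 4, 5, 10]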
Tomerikoo
kxr
  • This is the fastest method I have found. Please see timings in my answer. Great! – dawg May 10 '20 at 19:05
  • Note that the naïve version of this (without using `i` to provide a `start` argument for `list.index`) is `O(n²)`; in a simple local test, where the `lst` argument is the result of `list(range(10)) * 100` (1000 element `list`, where 100 elements, evenly spaced, get replaced), the difference is noticeable; this answer (which is not naïve, and achieves `O(n)` performance) does the work in about 25 µs, where the naïve version took about 615 µs on the same machine. – ShadowRanger Feb 25 '21 at 15:38
11
>>> a=[1,2,3,4,5,1,2,3,4,5,1]
>>> item_to_replace = 1
>>> replacement_value = 6
>>> indices_to_replace = [i for i,x in enumerate(a) if x==item_to_replace]
>>> indices_to_replace
[0, 5, 10]
>>> for i in indices_to_replace:
...     a[i] = replacement_value
... 
>>> a
[6, 2, 3, 4, 5, 6, 2, 3, 4, 5, 6]
>>> 
John La Rooy
8

I know this is a very old question and there are a myriad of ways to do it. The simplest one I found uses the numpy package.

import numpy

arr = numpy.asarray([1, 6, 1, 9, 8])
arr[arr == 8] = 0  # change all occurrences of 8 to 0
print(arr)
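Note that `arr` here is a NumPy array rather than a plain list; if a list is needed afterwards, it can be converted back, for example (a small editorial sketch, not from the original answer):

a = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1]
arr = numpy.asarray(a)
arr[arr == 1] = 10   # vectorized replacement
a = arr.tolist()     # back to a plain Python list
print(a)             # [10, 2, 3, 4, 5, 10, 2, 3, 4, 5, 10]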
Tiago Vieira
  • 2
    Assuming you're already using `numpy`, this is a great solution; it's the same `O(n)` as all the other good solutions, but pushing all the work to vectorized C layer operations means it will outperform the other solutions dramatically by virtue of eliminating per-item interpreter overhead. – ShadowRanger Feb 25 '21 at 16:44
7

My use case was replacing None with some default value.

I've timed the approaches to this problem that were presented here, including the one by @kxr using list.index, and a variant using list.count.

Test code in ipython with Python 3.8.1:

def rep1(lst, replacer = 0):
    ''' List comprehension, new list '''

    return [item if item is not None else replacer for item in lst]


def rep2(lst, replacer = 0):
    ''' List comprehension, in-place '''    
    lst[:] =  [item if item is not None else replacer for item in lst]

    return lst


def rep3(lst, replacer = 0):
    ''' enumerate() with comparison - in-place '''
    for idx, item in enumerate(lst):
        if item is None:
            lst[idx] = replacer

    return lst


def rep4(lst, replacer = 0):
    ''' Using list.index + exception, in-place '''

    idx = -1
    # none_amount = lst.count(None)
    while True:
        try:
            idx = lst.index(None, idx+1)
        except ValueError:
            break
        else:
            lst[idx] = replacer

    return lst


def rep5(lst, replacer = 0):
    ''' Using list.index + list.count, in-place '''

    idx = -1
    for _ in range(lst.count(None)):
        idx = lst.index(None, idx+1)
        lst[idx] = replacer

    return lst


def rep6(lst, replacer = 0):
    ''' Using map, return map iterator '''

    return map(lambda item: item if item is not None else replacer, lst)


def rep7(lst, replacer = 0):
    ''' Using map, return new list '''

    return list(map(lambda item: item if item is not None else replacer, lst))


lst = [5]*10**6
# lst = [None]*10**6

%timeit rep1(lst)    
%timeit rep2(lst)    
%timeit rep3(lst)    
%timeit rep4(lst)    
%timeit rep5(lst)    
%timeit rep6(lst)    
%timeit rep7(lst)    

I get:

26.3 ms ± 163 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
29.3 ms ± 206 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
33.8 ms ± 191 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
11.9 ms ± 37.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
11.9 ms ± 60.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
260 ns ± 1.84 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
56.5 ms ± 204 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Using the internal list.index is in fact faster than any manual comparison.

I didn't know whether the exception handling in test 4 would be more costly than using list.count (test 5); the difference seems negligible.

Note that map() (test 6) returns an iterator and not an actual list (hence its misleadingly tiny timing), thus test 7.

Jay
  • You've shown that using the internal `str.index` is faster if you have nothing to replace. If all elements are `None` I'd expect `rep4` and `rep5` to be very slow, as the method is O(nm), whereas the others are O(n), with n elements and m `None` values. – Cris Luengo Jan 14 '21 at 08:26
  • 1
    @CrisLuengo: `rep4`/`rep5` scale fine; they both use a `start` parameter based on the position of the last replacement, so they remain `O(n)`; `index` is `O(n)` if run on the whole `list` every time, but the `start` parameter ensures all the `index` calls put together traverse each index of the `list` exactly once. They get slower as the number of hits goes up, but for non-big-O related reasons (fixed overhead of the `index` call paid more); in practice, using [kxr's better version of `rep4`](https://stackoverflow.com/a/59478892/364696), 1000 `None`s only takes ~3x longer than 1000 `1`s. – ShadowRanger Feb 25 '21 at 16:38
  • If you do want to incorporate that fixed overhead of the `index` calls, the real work done is `O(n + m)`, not `O(nm)`; you pay the fixed overhead of `index` (along with the associated work of reassigning values) once for each `None`, and the cumulative non-fixed overhead cost of all the `index` calls put together is `O(n)` in terms of the length of the `list`. Real big-O computations would still call it `O(n)` though, since `m` is bounded by `n`, meaning `m` can be interpreted as just another `n` term, and `O(n + n)` is the same as `O(n)` (since constant coefficients in `2n` are dropped). – ShadowRanger Feb 25 '21 at 16:40
  • @ShadowRanger: Thanks, I thought `index` searched from the beginning every time, didn't pay enough attention. My comment about the test still stands though: it's showing times when nothing needs to be replaced. Better test data would be necessary. – Cris Luengo Feb 25 '21 at 17:07
4

The answers to this old but relevant question vary wildly in speed.

The fastest is the solution posted by kxr.

However, the following is even faster and is not posted in the other answers:

def f1(arr, find, replace):
    # fast and readable
    base=0
    for cnt in range(arr.count(find)):
        offset=arr.index(find, base)
        arr[offset]=replace
        base=offset+1

Here is the timing for the various solutions. The faster ones are 3X faster than the accepted answer and 5X faster than the slowest answer here.

To be fair, all methods needed to do in-place replacement of the list sent to the function.

Please see timing code below:

def f1(arr, find, replace):
    # fast and readable
    base=0
    for cnt in range(arr.count(find)):
        offset=arr.index(find, base)
        arr[offset]=replace
        base=offset+1
        
def f2(arr,find,replace):
    # accepted answer
    for i,e in enumerate(arr):
        if e==find: 
            arr[i]=replace
        
def f3(arr,find,replace):
    # in place list comprehension
    arr[:]=[replace if e==find else e for e in arr]
    
def f4(arr,find,replace):
    # in place map and lambda -- SLOW
    arr[:]=list(map(lambda x: x if x != find else replace, arr))
    
def f5(arr,find,replace):
    # find index with comprehension
    for i in [i for i, e in enumerate(arr) if e==find]:
        arr[i]=replace
        
def f6(arr,find,replace):
    # FASTEST but a little less clear
    try:
        while True:
            arr[arr.index(find)]=replace
    except ValueError:
        pass    

def f7(lst, old, new):
    """replace list elements (inplace)"""
    i = -1
    try:
        while 1:
            i = lst.index(old, i + 1)
            lst[i] = new
    except ValueError:
        pass
    
    
import time     

def cmpthese(funcs, args=(), cnt=1000, rate=True, micro=True):
    """Generate a Perl style function benchmark"""                   
    def pprint_table(table):
        """Perl style table output"""
        def format_field(field, fmt='{:,.0f}'):
            if type(field) is str: return field
            if type(field) is tuple: return field[1].format(field[0])
            return fmt.format(field)     

        def get_max_col_w(table, index):
            return max([len(format_field(row[index])) for row in table])         

        col_paddings=[get_max_col_w(table, i) for i in range(len(table[0]))]
        for i,row in enumerate(table):
            # left col
            row_tab=[row[0].ljust(col_paddings[0])]
            # rest of the cols
            row_tab+=[format_field(row[j]).rjust(col_paddings[j]) for j in range(1,len(row))]
            print(' '.join(row_tab))                

    results={}
    for i in range(cnt):
        for f in funcs:
            start=time.perf_counter_ns()
            f(*args)
            stop=time.perf_counter_ns()
            results.setdefault(f.__name__, []).append(stop-start)
    results={k:float(sum(v))/len(v) for k,v in results.items()}     
    fastest=sorted(results,key=results.get, reverse=True)
    table=[['']]
    if rate: table[0].append('rate/sec')
    if micro: table[0].append('\u03bcsec/pass')
    table[0].extend(fastest)
    for e in fastest:
        tmp=[e]
        if rate:
            tmp.append('{:,}'.format(int(round(float(cnt)*1000000.0/results[e]))))

        if micro:
            tmp.append('{:,.1f}'.format(results[e]/float(cnt)))

        for x in fastest:
            if x==e: tmp.append('--')
            else: tmp.append('{:.1%}'.format((results[x]-results[e])/results[e]))
        table.append(tmp) 

    pprint_table(table)                    



if __name__=='__main__':
    import sys
    import time 
    print(sys.version)
    cases=(
        ('small, found', 9, 100),
        ('small, not found', 99, 100),
        ('large, found', 9, 1000),
        ('large, not found', 99, 1000)
    )
    for txt, tgt, mul in cases:
        print(f'\n{txt}:')
        arr=[1,2,3,4,5,6,7,8,9,0]*mul 
        args=(arr,tgt,'X')
        cmpthese([f1,f2,f3, f4, f5, f6, f7],args)   

And the results:

3.9.1 (default, Feb  3 2021, 07:38:02) 
[Clang 12.0.0 (clang-1200.0.32.29)]

small, found:
   rate/sec μsec/pass     f4     f3     f5     f2     f6     f7     f1
f4  133,982       7.5     -- -38.8% -49.0% -52.5% -78.5% -78.6% -82.9%
f3  219,090       4.6  63.5%     -- -16.6% -22.4% -64.8% -65.0% -72.0%
f5  262,801       3.8  96.1%  20.0%     --  -6.9% -57.8% -58.0% -66.4%
f2  282,259       3.5 110.7%  28.8%   7.4%     -- -54.6% -54.9% -63.9%
f6  622,122       1.6 364.3% 184.0% 136.7% 120.4%     --  -0.7% -20.5%
f7  626,367       1.6 367.5% 185.9% 138.3% 121.9%   0.7%     -- -19.9%
f1  782,307       1.3 483.9% 257.1% 197.7% 177.2%  25.7%  24.9%     --

small, not found:
   rate/sec μsec/pass     f4     f5     f2     f3     f6     f7     f1
f4   13,846      72.2     -- -40.3% -41.4% -47.8% -85.2% -85.4% -86.2%
f5   23,186      43.1  67.5%     --  -1.9% -12.5% -75.2% -75.5% -76.9%
f2   23,646      42.3  70.8%   2.0%     -- -10.8% -74.8% -75.0% -76.4%
f3   26,512      37.7  91.5%  14.3%  12.1%     -- -71.7% -72.0% -73.5%
f6   93,656      10.7 576.4% 303.9% 296.1% 253.3%     --  -1.0%  -6.5%
f7   94,594      10.6 583.2% 308.0% 300.0% 256.8%   1.0%     --  -5.6%
f1  100,206      10.0 623.7% 332.2% 323.8% 278.0%   7.0%   5.9%     --

large, found:
   rate/sec μsec/pass     f4     f2     f5     f3     f6     f7     f1
f4      145   6,889.4     -- -33.3% -34.8% -48.6% -85.3% -85.4% -85.8%
f2      218   4,593.5  50.0%     --  -2.2% -22.8% -78.0% -78.1% -78.6%
f5      223   4,492.4  53.4%   2.3%     -- -21.1% -77.5% -77.6% -78.2%
f3      282   3,544.0  94.4%  29.6%  26.8%     -- -71.5% -71.6% -72.3%
f6      991   1,009.5 582.4% 355.0% 345.0% 251.1%     --  -0.4%  -2.8%
f7      995   1,005.4 585.2% 356.9% 346.8% 252.5%   0.4%     --  -2.4%
f1    1,019     981.3 602.1% 368.1% 357.8% 261.2%   2.9%   2.5%     --

large, not found:
   rate/sec μsec/pass     f4     f5     f2     f3     f6     f7     f1
f4      147   6,812.0     -- -35.0% -36.4% -48.9% -85.7% -85.8% -86.1%
f5      226   4,424.8  54.0%     --  -2.0% -21.3% -78.0% -78.1% -78.6%
f2      231   4,334.9  57.1%   2.1%     -- -19.6% -77.6% -77.7% -78.2%
f3      287   3,484.0  95.5%  27.0%  24.4%     -- -72.1% -72.2% -72.8%
f6    1,028     972.3 600.6% 355.1% 345.8% 258.3%     --  -0.4%  -2.7%
f7    1,033     968.2 603.6% 357.0% 347.7% 259.8%   0.4%     --  -2.3%
f1    1,057     946.2 619.9% 367.6% 358.1% 268.2%   2.8%   2.3%     --
dawg
  • Your f1 and f6 are O(n^2), so for large enough lists they will eventually be much slower than the O(n) solutions. It's possibly worth finding the approximate crossover and switching strategies for some length of list. – John La Rooy May 11 '20 at 02:09
  • How is f6 O(n^2)? – dawg May 11 '20 at 04:17
  • @dawg: `f6` is `O(n²)` because it uses `index` internally without adjusting the start position for the search. In a `list` consisting solely of things to replace, that means `n` calls to `index`, each of which do an average of `n / 2` work (the first one is `1` work, the last `n` work, it counts up in between; the first element of the `list` is checked `n` times, the second `n - 1` times, etc.). [kxr's answer](https://stackoverflow.com/a/59478892/364696) tracks the position of each replacement and uses it to avoid rechecking, keeping it to `O(n)`. – ShadowRanger Feb 25 '21 at 21:13
  • @ShadowRanger: I took your comments and fixed `f1` so now it tracks the base offset. No more `O(n²)` – dawg Feb 27 '21 at 01:48
  • @dawg: Yup, that works. Without the `+1`, it does rescan every element that it just replaced, so in the `list` of all elements to replace, it's checking each index twice, instead of just once, but that's a fixed multiplier that doesn't affect big-O (and avoiding the `+ 1` saves a surprising amount of work; the overhead of simple math is surprisingly high). Has one significant problem: It will go into an infinite loop if the replacement value compares equal to the search value, so if you're, say, replacing `1` with `True` or `1.0`, kaboom; I prefer kxr's approach for bulletproofing. – ShadowRanger Feb 27 '21 at 02:07
  • Side-note: There is no need for separate values in the version you're using. You could remove the last line and replace all instances of `offset` with `base` and it would behave equivalently. Or go crazy and walrus it to make the body of the loop a one-liner: `arr[(base := arr.index(find, base))] = replace`. :-) – ShadowRanger Feb 27 '21 at 02:10
  • @ShadowRanger This benchmark is btw messed up. It uses the same list over and over again. Meaning the first run of the first solution replaces all occurrences, and then the remaining 999 runs of that solution don't have to replace anything, and neither do the remaining solutions in their 1000 runs. – Kelly Bundy Dec 14 '21 at 01:14
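One way to address the reuse issue raised in the last comment is to time each function on a fresh copy of the input list; a rough editorial sketch using timeit (assuming the functions above are already defined in the session):

import timeit

arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0] * 1000

# Copy the list inside the timed callable so every run actually performs the
# replacements; the cost of arr.copy() is included in every measurement, so
# only compare the results against each other.
for fn in (f1, f2, f3, f4, f5, f6, f7):
    t = timeit.timeit(lambda: fn(arr.copy(), 9, 'X'), number=1000)
    print(f'{fn.__name__}: {t:.3f} s for 1000 runs')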
1

In many cases, defining a replacer function and calling it in a loop is very readable. It's very useful if values need to be replaced using some rule. For example, to replace values depending on whether it's divisible by 15, 3 or 5, one can define a function that checks conditions and returns an appropriate value.

def fizzbuzz(num):
    if num % 15 == 0:
        return 'FizzBuzz'
    elif num % 3 == 0:
        return 'Fizz'
    elif num % 5 == 0:
        return 'Buzz'
    else:
        return num

a = [1, 2, 3, 4, 1, 5, 3, 2, 6, 1, 1]
a[:] = (fizzbuzz(n) for n in a)
# or 
a[:] = map(fizzbuzz, a)
a # [1, 2, 'Fizz', 4, 1, 'Buzz', 'Fizz', 2, 'Fizz', 1, 1]

One thing to note is that (at least as of python 3.10) if a function needs to be called on every element in a list, then map() is faster than a comprehension.1

This difference is even more pronounced for built-in methods such as dict.get().2 So @roipoussiere's solution can be made twice as fast by simply mapping it like the following.

a = [1, 2, 3, 4, 1, 5, 3, 2, 6, 1, 1]
replacer = {1:10, 2:20, 3:'foo'}.get
a[:] = map(replacer, a, a)        # in-place replacement
a1 = [*map(replacer, a, a)]       # new list

1 map() is ~20% faster than calling a function in a list comprehension in the example below.

a = [1, 2, 3, 4, 1, 5, 3, 2, 6, 1, 1]*10000
%timeit [fizzbuzz(n) for n in a]
# 15.1 ms ± 443 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

%timeit list(map(fizzbuzz, a))
# 12.7 ms ± 27.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

2 When replacing values using dict.get(), mapping it is ~2 times faster than calling it in a list comprehension.

a = [1, 2, 3, 4, 1, 5, 3, 2, 6, 1, 1]*10000
replacer = {1:10, 2:20, 3:'foo'}.get

%timeit [replacer(n,n) for n in a]
# 4.64 ms ± 195 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

%timeit list(map(replacer, a, a))
# 2.21 ms ± 13.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
cottontail
0

I might be a dumb-dumb, but I would write a separate, simple function for this:

def convertElements( oldlist, convert_dict ):
  newlist = []
  for e in oldlist:
    if e in convert_dict:
      newlist.append(convert_dict[e])
    else:
      newlist.append(e)
  return newlist

And then call this as needed like so:

a = [1,2,3,4,5,1,2,3,4,5,1]
a_new = convertElements(a, {1: 10})
## OUTPUT: a_new=[10, 2, 3, 4, 5, 10, 2, 3, 4, 5, 10]
Jerry Chen
  • No need for `if/else`. Simply do `newlist.append(convert_dict.get(e, e))`. The [`get`](https://docs.python.org/3/library/stdtypes.html#dict.get) method has a `default` argument that is returned if the key is not in the dict. So if it is not, return it... Then it can also become more conveniently a list-comp: `newlist = [convert_dict.get(e, e) for e in oldlist]` – Tomerikoo Jan 26 '22 at 15:57
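Putting that comment's suggestion into a function, a compact version might look like this (an editorial sketch, not part of the original answer):

def convert_elements(oldlist, convert_dict):
    """Return a new list with values swapped according to convert_dict."""
    return [convert_dict.get(e, e) for e in oldlist]

a = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1]
print(convert_elements(a, {1: 10}))
# [10, 2, 3, 4, 5, 10, 2, 3, 4, 5, 10]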