346

I've seen there are actually two (maybe more) ways to concatenate lists in Python:

One way is to use the extend() method:

a = [1, 2]
b = [2, 3]
b.extend(a)

The other is to use the += (in-place addition) operator:

b += a

Now I wonder: which of those two options is the 'Pythonic' way to do list concatenation, and is there a difference between the two? (I've looked up the official Python tutorial but couldn't find anything about this topic.)

vvvvv
helpermethod
  • 1
    Maybe the difference has more implications when it comes to ducktyping and if your *maybe-not-really-a-list-but-like-a-list* supports `.__iadd__()`/`.__add__()`/`.__radd__()` versus `.extend()` – Nick T Dec 15 '14 at 22:22

11 Answers

293

The only difference at the bytecode level is that the .extend way involves a function call, which is slightly more expensive in Python than INPLACE_ADD.

It's really nothing you should worry about unless you're performing this operation billions of times. Even then, the bottleneck would likely lie somewhere else.
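
To check the bytecode claim on your own interpreter, here is a minimal sketch using the standard dis module. The function names with_extend and with_iadd are made up for illustration, and opcode names vary between CPython versions (newer releases report += as BINARY_OP rather than INPLACE_ADD), so treat the exact output as version-dependent.

```python
import dis


def with_extend(a, b):
    # Method lookup followed by a call
    b.extend(a)


def with_iadd(a, b):
    # A single in-place add instruction followed by a store
    b += a


def opnames(func):
    """Return the opcode names of a function, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


extend_ops = opnames(with_extend)
iadd_ops = opnames(with_iadd)

print("extend:", extend_ops)
print("+=    :", iadd_ops)
```

On the versions I'm aware of, only the extend variant contains a CALL-style instruction.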

jesterjunk
SilentGhost
  • 22
    Maybe the difference has more implications when it comes to ducktyping and if your *maybe-not-really-a-list-but-like-a-list* supports `.__iadd__()`/`.__add__()`/`.__radd__()` versus `.extend()` – Nick T Dec 15 '14 at 22:21
  • 16
    This answer fails to mention the important scoping differences. – wim Jan 27 '17 at 16:11
  • 12
    Well actually, extend is faster than INPLACE_ADD, i.e. the list concatenation. https://gist.github.com/mekarpeles/3408081 – Archit Jul 18 '18 at 07:03
  • 3
    For me, this answer didn't really help me decide which one I should use as a general principle. I think consistency is important, and knowing things like how it can't be used with non-locals, and can't be chained (from the other answers) provides a more practical, functional reason to use `extend()` over the operator, even when there's a choice. "Billions of operations" use case is a valid point, but not one I run into more than a handful of times in my career. – void.pointer Feb 17 '21 at 17:18
  • 4
    `.extend` is faster than `+`. There is nothing to do with extend having an extra function call. `+` is an operator and it also causes a function call. The reason `.extend` is faster is because it does much less work. `+` will (1) create a list, copy all elements (references) from that list, then it will get the second list and add the references. `.extend` will not create a new list nor copy references elements from that list. extend is equivalent to `a[len(a):] = iterable`. extend will operate over the list you are doing the operation and should be used instead of `L = L + iterable` – fsan Feb 01 '23 at 16:58
234

You can't use += on a non-local variable (a variable which is not local to the function and also not global):

def main():
    l = [1, 2, 3]

    def foo():
        l.extend([4])

    def boo():
        l += [5]

    foo()
    print l
    boo()  # this will fail

main()

This is because, in the extend case, the compiler loads the variable l with a LOAD_DEREF instruction, but for += the assignment makes l local to boo, so it uses LOAD_FAST, and you get *UnboundLocalError: local variable 'l' referenced before assignment*
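
Here is a Python 3 sketch of the same situation, including the nonlocal workaround mentioned in the comments. The names items, extend_it, iadd_broken and iadd_fixed are only illustrative.

```python
def main():
    items = [1, 2, 3]

    def extend_it():
        # Only reads the enclosing variable, then mutates the list
        items.extend([4])

    def iadd_broken():
        # The assignment part of += makes `items` local to this function
        items += [5]

    def iadd_fixed():
        # Declaring the name nonlocal lets += rebind the enclosing variable
        nonlocal items
        items += [5]

    extend_it()
    try:
        iadd_broken()
        error = None
    except UnboundLocalError as exc:
        error = exc
    iadd_fixed()
    return items, error


result, error = main()
print(result)
print(type(error).__name__)
```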

Donbeo
monitorius
  • 6
    I'm having difficulty with your explanation "variable which is **not local** for function and also **not global**". Could you give an example of such a variable? – Stephane Rolland Aug 07 '14 at 07:44
  • 11
    Variable 'l' in my example is exactly of that kind. It's not local for 'foo' and 'boo' functions (outside of their scopes), but it's not global (defined inside 'main' func, not on module level) – monitorius Aug 07 '14 at 09:34
  • 3
    I can confirm that this error still occurs with python 3.4.2 (you'll need to add parentheses to print but everything else can stay the same). – trichoplax is on Codidact now Jul 01 '15 at 15:53
  • 8
    That's right. But at least you can use *nonlocal l* statement in *boo* in Python3. – monitorius Jul 02 '15 at 09:58
  • 1
    compiler -> interpreter? – joel Mar 17 '20 at 16:21
  • 1
    A compiler from program text to python byte-code will create these instructions, and then interpreter of byte-code will run them =) – monitorius Mar 18 '20 at 18:12
72

You can chain function calls, but you can't += a function call directly:

class A:
    def __init__(self):
        self.listFoo = [1, 2]
        self.listBar = [3, 4]

    def get_list(self, which):
        if which == "Foo":
            return self.listFoo
        return self.listBar

a = A()
other_list = [5, 6]

a.get_list("Foo").extend(other_list)
a.get_list("Foo") += other_list  #SyntaxError: can't assign to function call
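
If you still want += here, one workaround is to bind the returned list to a name first. This is just a sketch with a cut-down version of the class above (list_foo and target are illustrative names); it also uses compile() to show that the direct form is rejected before any code runs.

```python
class A:
    def __init__(self):
        self.list_foo = [1, 2]

    def get_list(self):
        return self.list_foo


a = A()
other_list = [5, 6]

# Bind the result to a name; += then mutates the same list object in place
target = a.get_list()
target += other_list

# The direct form never gets as far as running: it fails at compile time
try:
    compile("a.get_list() += other_list", "<demo>", "exec")
    syntax_error = None
except SyntaxError as exc:
    syntax_error = exc

print(a.list_foo)
print(type(syntax_error).__name__)
```
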
isarandi
9

I would say there is some difference when it comes to NumPy. (I realize the question asks about concatenating two lists, not NumPy arrays, but since this might trip up beginners such as me, I hope it helps someone who lands on this post.) For example:

import numpy as np
a = np.zeros((4,4,4))
b = []
b += a

This raises an error:

ValueError: operands could not be broadcast together with shapes (0,) (4,4,4)

whereas b.extend(a) works fine (it appends the four 4x4 sub-arrays to b).

Lance Ruo Zhang
6

The .extend() method on lists works with any iterable*, += works with some but can get funky.

import numpy as np

l = [2, 3, 4]
t = (5, 6, 7)
l += t
l
[2, 3, 4, 5, 6, 7]

l = [2, 3, 4]
t = np.array((5, 6, 7))
l += t
l
array([ 7,  9, 11])

l = [2, 3, 4]
t = np.array((5, 6, 7))
l.extend(t)
l
[2, 3, 4, 5, 6, 7]

Python 3.6
*pretty sure .extend() works with any iterable but please comment if I am incorrect

Edit: "extend()" changed to "The .extend() method on lists" Note: David M. Helmuth's comment below is nice and clear.
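
The same effect can be reproduced without NumPy. As far as I can tell, it happens because CPython gives the right-hand operand's __radd__ a chance before falling back to the list's own in-place concatenation. The Greedy class below is a hypothetical stand-in for an array type, not anything from NumPy itself.

```python
class Greedy:
    """Hypothetical iterable that, like a NumPy array, defines __radd__."""

    def __init__(self, values):
        self.values = list(values)

    def __iter__(self):
        return iter(self.values)

    def __radd__(self, other):
        # Element-wise addition, loosely mimicking what an array would do
        return Greedy(x + y for x, y in zip(other, self.values))


l = [2, 3, 4]
l += Greedy([5, 6, 7])   # __radd__ wins, so l is rebound to a Greedy object
iadd_result = l

m = [2, 3, 4]
m.extend(Greedy([5, 6, 7]))   # extend just iterates over its argument

print(type(iadd_result).__name__, iadd_result.values)
print(type(m).__name__, m)
```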

grofte
  • Tuple is definitely an iterable, but it has no extend() method. extend() method has nothing to do with iteration. – wombatonfire Mar 22 '19 at 23:00
  • .extend is a method of the list class. From the Python documentation: `list.extend(iterable) Extend the list by appending all the items from the iterable. Equivalent to a[len(a):] = iterable.` Guess I answered my own asterisk. – grofte Mar 23 '19 at 11:57
  • Oh, you meant that you can pass any iterable to extend(). I read it as "extend() is available for any iterable" :) My bad, but it sounds a little ambiguous. – wombatonfire Mar 23 '19 at 12:03
  • 1
    All in all, this is not a good example, at least not in the context of this question. When you use a `+=` operator with objects of different types (contrary to two lists, as in the question), you can't expect that you will get a concatenation of the objects. And you can't expect that there will be a `list` type returned. Have a look at your code, you get an `numpy.ndarray` instead of `list`. – wombatonfire Mar 23 '19 at 13:18
  • 1
    @grofte provided the correct answer; however, the answer requires some clarification, so here is my suggested clarification: *When using the extend() method to concatenate your list with the values held in another iterable, you get consistent behaviour regardless of whether the other iterable is a list, tuple or even a NumPy array. That consistent behaviour isn’t the case when using the += operator to concatenate a second iterable to a list. (See examples given below) - That's how they differ* – David M. Helmuth Jul 05 '22 at 21:23
6

ary = ary + ext creates a new list object, then copies the references from both "ary" and "ext" into it.

ary.extend(ext) merely appends references to the items of "ext" to the end of "ary", resulting in fewer memory transactions.

As a result, in the timings below .extend works orders of magnitude faster and doesn't use any additional memory outside of the list being extended and the list it's being extended with.

╰─➤ time ./list_plus.py
./list_plus.py  36.03s user 6.39s system 99% cpu 42.558 total
╰─➤ time ./list_extend.py
./list_extend.py  0.03s user 0.01s system 92% cpu 0.040 total

The first script also uses over 200MB of memory, while the second one doesn't use any more memory than a 'naked' python3 process.

Having said that, the in-place addition does seem to do the same thing as .extend.
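
A quick way to see which operations build a new list is to keep a second reference to the original and compare identities. This is only a sketch; alias is an illustrative name.

```python
ary = [1, 2]
alias = ary                  # second reference to the same list object

ary += [3]                   # in place: alias sees the change
same_after_iadd = ary is alias

ary = ary + [4]              # builds a new list and rebinds ary only
same_after_add = ary is alias

print(same_after_iadd, same_after_add)
print(alias, ary)
```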

dniq
5

From the CPython 3.5.2 source code: No big difference.

static PyObject *
list_inplace_concat(PyListObject *self, PyObject *other)
{
    PyObject *result;

    result = listextend(self, other);
    if (result == NULL)
        return result;
    Py_DECREF(result);
    Py_INCREF(self);
    return (PyObject *)self;
}
Flux
VicX
5

Actually, there are differences among the three options: ADD, INPLACE_ADD and extend. The first is consistently slower in the timings below, while the other two are roughly the same.

With this information, I would rather use extend, which is faster than ADD, and seems to me more explicit of what you are doing than INPLACE_ADD.

Try the following code a few times (for Python 3):

import time

def test():
    x = list(range(10000000))
    y = list(range(10000000))
    z = list(range(10000000))

    # INPLACE_ADD
    t0 = time.process_time()
    z += x
    t_inplace_add = time.process_time() - t0

    # ADD
    t0 = time.process_time()
    w = x + y
    t_add = time.process_time() - t0

    # Extend
    t0 = time.process_time()
    x.extend(y)
    t_extend = time.process_time() - t0

    print('ADD {} s'.format(t_add))
    print('INPLACE_ADD {} s'.format(t_inplace_add))
    print('extend {} s'.format(t_extend))
    print()

for i in range(10):
    test()
ADD 0.3540440000000018 s
INPLACE_ADD 0.10896000000000328 s
extend 0.08370399999999734 s

ADD 0.2024550000000005 s
INPLACE_ADD 0.0972940000000051 s
extend 0.09610200000000191 s

ADD 0.1680199999999985 s
INPLACE_ADD 0.08162199999999586 s
extend 0.0815160000000077 s

ADD 0.16708400000000267 s
INPLACE_ADD 0.0797719999999913 s
extend 0.0801490000000058 s

ADD 0.1681250000000034 s
INPLACE_ADD 0.08324399999999343 s
extend 0.08062700000000689 s

ADD 0.1707760000000036 s
INPLACE_ADD 0.08071900000000198 s
extend 0.09226200000000517 s

ADD 0.1668420000000026 s
INPLACE_ADD 0.08047300000001201 s
extend 0.0848089999999928 s

ADD 0.16659500000000094 s
INPLACE_ADD 0.08019399999999166 s
extend 0.07981599999999389 s

ADD 0.1710910000000041 s
INPLACE_ADD 0.0783479999999912 s
extend 0.07987599999999873 s

ADD 0.16435900000000458 s
INPLACE_ADD 0.08131200000001115 s
extend 0.0818660000000051 s
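
An alternative sketch uses the standard timeit module, which repeats each statement many times and is less sensitive to one-off noise. The list sizes and repeat counts are arbitrary choices of mine, and the comparison is only rough: x keeps growing in the two in-place cases, while x + y always builds a list of the same size.

```python
import timeit

# Fresh lists are created by the setup code before each repeat
SETUP = "x = list(range(1000)); y = list(range(1000))"

STATEMENTS = {
    "ADD": "w = x + y",
    "INPLACE_ADD": "x += y",
    "extend": "x.extend(y)",
}

timings = {
    name: min(timeit.repeat(stmt, setup=SETUP, repeat=5, number=1000))
    for name, stmt in STATEMENTS.items()
}

for name, seconds in timings.items():
    print(f"{name:12s} {seconds:.6f} s")
```
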
dalonsoa
  • 2
    You can't compare `ADD` with `INPLACE_ADD` and `extend()`. `ADD` produces a new list and copies the elements of the two original lists to it. For sure it will be slower than inplace operation of `INPLACE_ADD` and `extend()`. – wombatonfire Mar 22 '19 at 23:05
  • 3
    I know that. The point of this example is comparing different ways of having a list with all elements together. Sure it takes longer because it does different things, but still it is good to know in case you are interested in preserving the original objects unaltered. – dalonsoa Apr 10 '19 at 13:40
3

I've looked up the official Python tutorial but couldn't find anything about this topic

This information happens to be buried in the Programming FAQ:

... for lists, __iadd__ [i.e. +=] is equivalent to calling extend on the list and returning the list. That's why we say that for lists, += is a "shorthand" for list.extend

You can also see this for yourself in the CPython source code: https://github.com/python/cpython/blob/v3.8.2/Objects/listobject.c#L1000-L1011
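
The FAQ statement can be checked directly from Python by calling the special method yourself. A minimal sketch (returned and extend_returned are illustrative names):

```python
a = [1, 2]

# Roughly what `a += [3, 4]` does for a list: extend, then return the list
returned = a.__iadd__([3, 4])

# extend performs the same mutation but returns None
extend_returned = a.extend([5])

print(returned is a, extend_returned, a)
```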

Flux
1

Only .extend() can be used when the list is in a tuple

This will work

t = ([],[])
t[0].extend([1,2])

while this won't

t = ([],[])
t[0] += [1,2]

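The reason is that += always finishes with an assignment. If you look at the long version:

t[0] = t[0].__iadd__([1,2])

you can see that the result has to be stored back into the tuple, which is not possible, so a TypeError is raised. Note that the list itself has already been extended in place by the time the error occurs. Using .extend() only modifies an object in the tuple, which is allowed.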
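
A small sketch of what actually happens: the += on a tuple element raises TypeError, yet the list inside the tuple has still been modified when the exception arrives.

```python
t = ([], [])

t[0].extend([1, 2])       # fine: mutates the list, never assigns to t[0]

try:
    t[0] += [3, 4]        # extends the list, then fails storing into t[0]
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)
print(t)
```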

Jann Poppinga
-1

According to Python for Data Analysis:

“Note that list concatenation by addition is a comparatively expensive operation since a new list must be created and the objects copied over. Using extend to append elements to an existing list, especially if you are building up a large list, is usually preferable.” Thus,

everything = []
for chunk in list_of_lists:
    everything.extend(chunk)

is faster than the concatenative alternative:

everything = []
for chunk in list_of_lists:
    everything = everything + chunk
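
To get a feel for why the second loop is more expensive, here is a rough sketch that counts how many item references each loop has to write. The input list_of_lists is made-up data, and the count ignores any internal resizing the list does while growing.

```python
list_of_lists = [[0] * 100 for _ in range(50)]  # hypothetical input

copied_by_plus = 0
everything = []
for chunk in list_of_lists:
    everything = everything + chunk
    copied_by_plus += len(everything)   # a fresh list of this size is built each pass

copied_by_extend = 0
grown = []
for chunk in list_of_lists:
    grown.extend(chunk)
    copied_by_extend += len(chunk)      # only the new items are appended

print(copied_by_plus, copied_by_extend)
```

The first count grows quadratically with the number of chunks, the second only linearly.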


littlebear333
  • 5
    `everything = everything + temp` is not necessarily implemented in the same way as `everything += temp`. – David Harrison Aug 01 '18 at 22:12
  • 1
    You are right. Thank you for your reminder. But my point is about the difference of efficiency. : ) – littlebear333 Aug 02 '18 at 05:35
  • 8
    @littlebear333 `everything += temp` is implemented in a way such that `everything` does not need to be copied. This pretty much makes your answer a moot point. – nog642 Aug 19 '18 at 20:27