11

Suppose we have two lists A = [a1, a2, ..., an] (n elements) and B = [b1, b2, ..., bm] (m elements), and we use "+" in Python to merge the two lists into one:

C = A + B

My question is: what is the runtime of this operation? My first guess is O(n+m), but I'm not sure if Python is smarter than that.

nbro
Toby
    Addition of two lists *will* be O(n+m) because each Python list is implemented as a fixed-size C array. When you add two lists, you are allocating memory for a new array and copying each element of each member list into the array. Using append() and extend() will improve performance to O(m), but if you *must* create a third list and preserve the original two, there's no obvious way to improve on O(n+m). – dylrei Mar 22 '15 at 18:06

3 Answers

16

When you concatenate the two lists with A + B, you create a completely new list in memory. This means your guess is correct: the complexity is O(n + m) (where n and m are the lengths of the lists) since Python has to walk both lists in turn to build the new list.
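You can sanity-check the linear growth empirically. This is a rough timing sketch rather than a rigorous benchmark (absolute numbers will vary by machine), but the time per concatenation should scale roughly with n + m:

```python
import timeit

# Time A + B for increasing list sizes; expect roughly linear growth.
for size in (10_000, 100_000, 1_000_000):
    a = list(range(size))
    b = list(range(size))
    t = timeit.timeit(lambda: a + b, number=100)
    print(f"n = m = {size:>9,}: {t:.4f}s for 100 concatenations")
```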

You can see this happening in the list_concat function in the source code for Python lists:

static PyObject *
list_concat(PyListObject *a, PyObject *bb)
{
/* ...code snipped... */
    src = a->ob_item;
    dest = np->ob_item;
    for (i = 0; i < Py_SIZE(a); i++) {     /* walking list a */
        PyObject *v = src[i];
        Py_INCREF(v);
        dest[i] = v;
    }
    src = b->ob_item;
    dest = np->ob_item + Py_SIZE(a);
    for (i = 0; i < Py_SIZE(b); i++) {     /* walking list b */
        PyObject *v = src[i];
        Py_INCREF(v);
        dest[i] = v;
    }
/* ...code snipped... */
}

If you don't need a new list in memory, it's often a good idea to take advantage of the fact that lists are mutable (and this is where Python is smart). Using A.extend(B) is O(m), meaning you avoid the overhead of copying list A.
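To make the difference concrete (a small sketch; `A += B` is equivalent to `A.extend(B)`):

```python
A = [1, 2, 3]
B = [4, 5, 6]

C = A + B      # builds a new list; O(n + m), both lists are copied
A.extend(B)    # mutates A in place; O(m), only B's elements are copied

print(C)  # [1, 2, 3, 4, 5, 6]
print(A)  # [1, 2, 3, 4, 5, 6] -- A itself was modified
```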

The complexity of various list operations is listed here on the Python wiki.

Alex Riley
2

My first guess is O(n+m), not sure if Python is smarter than that.

Nothing can be smarter than that while returning a copy. Even if A and B were immutable sequences such as strings, CPython would still make a full copy instead of aliasing the same memory (this simplifies the implementation of garbage collection for such objects).

In some specific cases, the operation can be O(1) depending on what you want to do with the result. For example, itertools.chain(A, B) lets you iterate over all items without making a copy (so changes to A or B affect the yielded items). If you need random access, you can emulate it with a Sequence subclass, e.g., WeightedPopulation, but in the general case the copy, and therefore the O(n+m) runtime, is unavoidable.
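A quick illustration of the chain approach. Note that because no copy is made, a mutation of A before the chain is consumed shows up in the iteration:

```python
from itertools import chain

A = [1, 2]
B = [3, 4]
combined = chain(A, B)   # O(1): no elements are copied yet

A.append(99)             # visible to the not-yet-consumed chain
print(list(combined))    # [1, 2, 99, 3, 4]
```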

jfs
1

Copying a list is O(n) (with n being the number of elements) and extending is O(k) (with k being the number of elements in the second list). Based on these two facts, I would think it couldn't be any less than O(n+k): concatenation is a copy-and-extend operation, and at the very least you need to copy all the elements of both lists.

Source: Python TimeComplexity
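One way to see the decomposition (a sketch of the equivalence, not CPython's actual code path):

```python
A = [1, 2, 3]
B = [4, 5, 6]

# A + B behaves like copying A (O(n)) and then extending with B (O(k)).
C = A.copy()   # O(n)
C.extend(B)    # O(k)

assert C == A + B  # same result as concatenation
```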

ARAT
TheBlackCat