Is it a linked list, an array? I searched around and only found people guessing. My C knowledge isn't good enough to look at the source code.

- 1,048,767
- 296
- 4,058
- 3,343

- 45,306
- 89
- 231
- 297
-
[According to docs](https://docs.python.org/3/faq/design.html#how-are-lists-implemented-in-cpython) , Python Lists are not Linked lists. They are variable size arrays. They are also mutable. I am not sure if it really implements a logical and a real capacity (which would make it a complete [dynamic array](https://en.wikipedia.org/wiki/Dynamic_array) . So you can say it is a unique data structure of its own. (Although I really believe its a Dynamic Array) – Jdeep Dec 02 '20 at 08:00
10 Answers
The C code is pretty simple, actually. Expanding one macro and pruning some irrelevant comments, the basic structure is in listobject.h
, which defines a list as:
typedef struct {
PyObject_HEAD
Py_ssize_t ob_size;
/* Vector of pointers to list elements. list[0] is ob_item[0], etc. */
PyObject **ob_item;
/* ob_item contains space for 'allocated' elements. The number
* currently in use is ob_size.
* Invariants:
* 0 <= ob_size <= allocated
* len(list) == ob_size
* ob_item == NULL implies ob_size == allocated == 0
*/
Py_ssize_t allocated;
} PyListObject;
PyObject_HEAD
contains a reference count and a type identifier. So, it's a vector/array that overallocates. The code for resizing such an array when it's full is in listobject.c
. It doesn't actually double the array, but grows by allocating
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
new_allocated += newsize;
to the capacity each time, where newsize
is the requested size (not necessarily allocated + 1
because you can extend
by an arbitrary number of elements instead of append
'ing them one by one).
See also the Python FAQ.

- 495
- 1
- 7
- 10

- 355,277
- 75
- 744
- 836
-
13So, when iterating over python lists it's as slow as linked lists, because every entry is just a pointer, so every element most likely would cause a cache miss. – Kr0e Apr 10 '14 at 15:24
-
18@Kr0e: not if subsequent elements are actually the same object :) But if you need smaller/cache-friendlier data structures, the `array` module or NumPy are to be preferred. – Fred Foo Apr 11 '14 at 08:55
-
6@Kr0e I wouldn't say iterating over the list is as slow as linked lists, but that iterating over the __values__ of the linked lists is a slow as a linked list, with the caveat that Fred mentioned. For example, iterating over a list to copy it to another one should be faster than a linked list. – Dan Ganea Dec 18 '19 at 22:49
It's a dynamic array. Practical proof: Indexing takes (of course with extremely small differences (0.0013 µsecs!)) the same time regardless of index:
...>python -m timeit --setup="x = [None]*1000" "x[500]"
10000000 loops, best of 3: 0.0579 usec per loop
...>python -m timeit --setup="x = [None]*1000" "x[0]"
10000000 loops, best of 3: 0.0566 usec per loop
I would be astounded if IronPython or Jython used linked lists - they would ruin the performance of many many widely-used libraries built on the assumption that lists are dynamic arrays.

- 260,549
- 28
- 431
- 505
-
1Nice proof of concept. My results for `x[0]` and `x[500]` weren't exactly equal, but they were very close (close enough to be rounding error or just differences in CPU usage). By the way, my CPU is better than yours :) – Rafe Kettler Oct 12 '10 at 19:51
-
4@Ralf: I know my CPU (most other hardware too, for that matter) is old and dog slow - on the bright side, I can assume that code that runs fast enough for me is fast enough for all users :D – Oct 12 '10 at 19:58
-
1
-
2@John: Aaah yes, good catch... fixed and edited it (also now using len-1 (999) as first index). – Oct 12 '10 at 21:21
-
1@delnan: A smarter linked-list implementation of anything might keep pointers to both the first element and the last element. 999 is less "proof" than 500. – John Machin Oct 13 '10 at 22:23
-
38Shows that it's not a naive implementation of a linked list. Doesn't definitively show that it's an array. – Michael Mior Oct 14 '10 at 05:15
-
13You can read about it here : http://docs.python.org/2/faq/design.html#how-are-lists-implemented – CCoder Sep 03 '13 at 09:39
-
11There are far more structures than just linked list and array, timing is of no practical use for deciding between them. – Ross Hemsley Jul 01 '14 at 06:24
-
See the docs https://docs.python.org/3/faq/design.html#how-are-lists-implemented, https://docs.python.org/2/faq/design.html#how-are-lists-implemented – ravi77o Nov 10 '17 at 17:39
-
1Strictly speaking the above does *not* proof how lists are implemented in Python, it specifies how lists are implemented in a *specific* interpreter. – Willem Van Onsem Oct 02 '18 at 18:41
-
@ravi77o it seems that the link you provided doesn't lead to the exact section anymore (probably due to new version of documentation) the updated exact link is now: https://docs.python.org/3/faq/design.html#how-are-lists-implemented-in-cpython – Amitay Drummer Jun 01 '22 at 10:00
I would suggest Laurent Luce's article "Python list implementation". Was really useful for me because the author explains how the list is implemented in CPython and uses excellent diagrams for this purpose.
List object C structure
A list object in CPython is represented by the following C structure.
ob_item
is a list of pointers to the list elements. allocated is the number of slots allocated in memory.typedef struct { PyObject_VAR_HEAD PyObject **ob_item; Py_ssize_t allocated; } PyListObject;
It is important to notice the difference between allocated slots and the size of the list. The size of a list is the same as
len(l)
. The number of allocated slots is what has been allocated in memory. Often, you will see that allocated can be greater than size. This is to avoid needing callingrealloc
each time a new elements is appended to the list.
...
Append
We append an integer to the list:
l.append(1)
. What happens?
We continue by adding one more element:
l.append(2)
.list_resize
is called with n+1 = 2 but because the allocated size is 4, there is no need to allocate more memory. Same thing happens when we add 2 more integers:l.append(3)
,l.append(4)
. The following diagram shows what we have so far.
...
Insert
Let’s insert a new integer (5) at position 1:
l.insert(1,5)
and look at what happens internally.
...
Pop
When you pop the last element:
l.pop()
,listpop()
is called.list_resize
is called insidelistpop()
and if the new size is less than half of the allocated size then the list is shrunk.You can observe that slot 4 still points to the integer but the important thing is the size of the list which is now 4. Let’s pop one more element. In
list_resize()
, size – 1 = 4 – 1 = 3 is less than half of the allocated slots so the list is shrunk to 6 slots and the new size of the list is now 3.You can observe that slot 3 and 4 still point to some integers but the important thing is the size of the list which is now 3.
...
Remove Python list object has a method to remove a specific element:
l.remove(5)
.
-
Thanks, I understand the **link part** of the list more now. Python list is a `aggregation`, not `composition`. I wish there were a list of composition too. – KRoy Sep 10 '18 at 22:30
This is implementation dependent, but IIRC:
- CPython uses an array of pointers
- Jython uses an
ArrayList
- IronPython apparently also uses an array. You can browse the source code to find out.
Thus they all have O(1) random access.

- 83,810
- 28
- 209
- 234
-
2Implementation dependent as in a python interpreter which implemented lists as linked lists would be a valid implementation of the python language? In other words: O(1) random access into lists is not guaranteed? Doesn't that make it impossible to write efficient code without relying on implementation details? – sepp2k Oct 12 '10 at 18:02
-
3@sepp I believe lists in Python are just ordered collections; the implementation and/or performance requirements of said implementation are not explicitly stated – NullUserException Oct 12 '10 at 18:05
-
7@sppe2k: Since Python doesn't really have a standard or formal specification (though there are some pieces of documentations that say "... is guaranteed to ..."), you can't be 100% sure as in "this is guaranteed by some piece of paper". But since `O(1)` for list indexing is a pretty common and valid assumption, no implementation would dare to break it. – Oct 12 '10 at 18:07
-
@Paul It says nothing about how the underlying implementation of lists should be done. – NullUserException Oct 12 '10 at 21:40
-
1It just doesn't happen to specify the big O running time of things. Language syntax specification doesn't necessarily mean the same thing as implementation details, it just happens to often be the case. – Paul McMillan Oct 12 '10 at 21:40
In CPython, lists are arrays of pointers. Other implementations of Python may choose to store them in different ways.

- 507,862
- 82
- 626
- 550
According to the documentation,
Python’s lists are really variable-length arrays, not Lisp-style linked lists.

- 358
- 4
- 9
As others have stated above, the lists (when appreciably large) are implemented by allocating a fixed amount of space, and, if that space should fill, allocating a larger amount of space and copying over the elements.
To understand why the method is O(1) amortized, without loss of generality, assume we have inserted a = 2^n elements, and we now have to double our table to 2^(n+1) size. That means we're currently doing 2^(n+1) operations. Last copy, we did 2^n operations. Before that we did 2^(n-1)... all the way down to 8,4,2,1. Now, if we add these up, we get 1 + 2 + 4 + 8 + ... + 2^(n+1) = 2^(n+2) - 1 < 4*2^n = O(2^n) = O(a) total insertions (i.e. O(1) amortized time). Also, it should be noted that if the table allows deletions the table shrinking has to be done at a different factor (e.g 3x)

- 5,293
- 3
- 26
- 23
-
As far as I understand, there is no copying of older elements. More space is allocated, but the new space is not contiguous with the space already being utilised, and only the newer element(s) to be inserted are copied to the new space. Please correct me if I am wrong. – Tushar Vazirani Feb 19 '20 at 13:55
-
@TusharVazirani I would assume that CPython has to copy the older elements to a new allocated space. Otherwise, the `PyObject **ob_item;` in Fred Foos answer at the top would not work, because it would have to point to the old memory space for old elements and to the new memory space for new elements. – Kilian Obermeier Feb 12 '23 at 15:59
A list in Python is something like an array, where you can store multiple values. List is mutable that means you can change it. The more important thing you should know, when we create a list, Python automatically creates a reference_id for that list variable. If you change it by assigning others variable the main list will be change. Let's try with a example:
list_one = [1,2,3,4]
my_list = list_one
#my_list: [1,2,3,4]
my_list.append("new")
#my_list: [1,2,3,4,'new']
#list_one: [1,2,3,4,'new']
We append my_list
but our main list has changed. That mean's list didn't assign as a copy list assign as its reference.
I found this article really helpful to understand how lists are implemented using python code.
class OhMyList:
def __init__(self):
self.length = 0
self.capacity = 8
self.array = (self.capacity * ctypes.py_object)()
def append(self, item):
if self.length == self.capacity:
self._resize(self.capacity*2)
self.array[self.length] = item
self.length += 1
def _resize(self, new_cap):
new_arr = (new_cap * ctypes.py_object)()
for idx in range(self.length):
new_arr[idx] = self.array[idx]
self.array = new_arr
self.capacity = new_cap
def __len__(self):
return self.length
def __getitem__(self, idx):
return self.array[idx]

- 782
- 2
- 13
- 25
In CPython list is implemented as dynamic array, and therefore when we append at that time not only one macro is added but some more space is allocated so that everytime new space should not be added.

- 415
- 5
- 12