84

What is the typical underlying data structure used to implement Python's built-in list data type?

Bakuriu
  • 98,325
  • 22
  • 197
  • 231
Nixuz
  • 3,439
  • 4
  • 39
  • 44
  • two options: 1) just curiosity, or 2) premature optimization. – flybywire May 27 '09 at 10:55
  • 2
    Someone else asked me this question and I told them that my intuition was that the implementation was array based but I wasn't sure. This got my curiosity up a bit so I decide to ask. – Nixuz May 27 '09 at 11:30
  • 17
    Believe it or not I did spend a couple minutes googling for the answer and even if I had downloaded the source code, I probably wouldn't know where to start. I figured someone on here would know the answer with minimal effort and it appears I was right. Easy rep for them, fast answer for me, everyone wins. – Nixuz May 27 '09 at 11:40
  • 19
    This is not at all silly. The whole reason why the Python list includes an append() operation but not a prepend() operation is precisely because Guido et al. think that list users need to be quite explicitly aware of the fact that it's an array to which it's easy and efficient to append things but quite expensive to prepend things. – Brandon Rhodes May 27 '09 at 13:11
  • 3
    Possible duplicate of [How is Python's List Implemented?](http://stackoverflow.com/questions/3917574/how-is-pythons-list-implemented) – bain Jul 02 '16 at 12:45

5 Answers5

61

List objects are implemented as arrays. They are optimized for fast fixed-length operations and incur O(n) memory movement costs for pop(0) and insert(0, v) operations which change both the size and position of the underlying data representation.

See also: http://docs.python.org/library/collections.html#collections.deque

Btw, I find it interesting that the Python tutorial on data structures recommends using pop(0) to simulate a queue but does not mention O(n) or the deque option.

http://docs.python.org/tutorial/datastructures.html#using-lists-as-queues

Georg Schölly
  • 124,188
  • 49
  • 220
  • 267
e1i45
  • 1,519
  • 1
  • 14
  • 20
  • 7
    The tutorial existed long long before the deque module, that's why. Report it to bugs.python.org , if possible with a patch for a correct sentence and the tutorial will no longer gives incorrect hints. – Philippe F May 27 '09 at 13:16
  • On several interviews, I have been told that the underlying data structure for lists is linked lists. Is that correct? In that case, for dictionaries also it is linked list. Is that so? – Shruti Kar Jul 02 '19 at 22:32
31

CPython:

typedef struct {
    PyObject_VAR_HEAD
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     * list.sort() temporarily sets allocated to -1 to detect mutations.
     *
     * Items must normally not be NULL, except during construction when
     * the list is not yet visible outside the function that builds it.
     */
    Py_ssize_t allocated;
} PyListObject;

As can be seen on the following line, the list is declared as an array of pointers to PyObjects.

PyObject **ob_item;
Georg Schölly
  • 124,188
  • 49
  • 220
  • 267
13

In the Jython implementation, it's an ArrayList<PyObject>.

Hank Gay
  • 70,339
  • 36
  • 160
  • 222
nategood
  • 11,807
  • 4
  • 36
  • 44
2

Although it may be obvious, worth stating that Python lists are Dynamic arrays (as opposed to Static arrays). This is an important distinction that comes up in job interview questions/academia.

Because the array is dynamic, Python reserves an amount of memory upon declaration, eg:

somelist = []

Because extra memory already is set aside, performing somelist.append() simply writes to the next reserved memory slot(s) and hence is O(1) most of the time. For a static array, typically the array is full (ie. if have 4 bytes, then array size is 4) and appends would always be O(n) because they require reserving an entirely new set of memory (maybe 5 bytes now) and copying contents over.

Adam Hughes
  • 14,601
  • 12
  • 83
  • 122
-1

The list is an inbuilt data structure in python. But can be used to create user-defined data structures. Two main user-defined data structures created by lists are stacks and queues.

  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Jun 30 '22 at 08:50