2

Can you please explain the process that is performed on the background when we access an element of the list by index?

My knowledge of C is pretty limited, thus I cannot fully understand the source code. So far I have identified this function declaration as the relevant one.

Based on this article, my understanding is that a list is an array of pointers. Once the list is created, the list structure contains the length and the allocated memory which is normally greater than size. However, I do not understand the following:

When we access an element, do we make use of length of the list? Or we just access the pointer ob_item[i]?

The following question does not answer this one as it only covers list implementation.

Daria
  • 163
  • 1
  • 2
  • 8
  • 1
    Besides the list range checks, all the essential logic is in the line `return ((PyListObject *)op) -> ob_item[i];`, and `ob_item` is explained in the linked question. – bereal Dec 01 '21 at 08:23
  • Does it mean that in order to return an element we need to know the length of the array to perform those range checks? – Daria Dec 01 '21 at 08:24
  • The size of the list is certainly stored within the interpreter since `__len__` is possible. – Mateen Ulhaq Dec 01 '21 at 08:26
  • I have edited the question to make it a bit more concrete. I do not understand the following: whether when we access an element, do we make use of the length of the list? – Daria Dec 01 '21 at 08:32

2 Answers2

3

Python uses the length of the list you're indexing to check if the index is out of range. It doesn't need the length for the actual indexing step, but C doesn't have any protection against indexing beyond the end of an array, so checking the length first is important.

It's worth noting that the function you linked is not the function that implements indexing from Python code. Rather, it is a C API function that is mostly intended for other C code to use. The function that actually gets called when you index a list from Python code is called list_subscript which you can see further down the source code. That function handles negative indexes (and slices) and then calls another function to do the actual indexing. For integer indexes, it calls list_item, which is almost the same as the more generic PyList_GetItem you were looking at (PyList_GetItem includes a check that confirms the object it's being called on is actually a list).

Blckknght
  • 100,903
  • 11
  • 120
  • 169
  • Thank you for this answer. This makes total sense to me. Does the same stand for reverse indexing? In the list_subscript I can see that if i is negative, then PyList_GET_SIZE(self) is called. Would that imply that for the reverse indexing we need the size ? – Daria Dec 01 '21 at 12:16
  • Based on the lines 2929 and 2930 in the source code you provided, the way it is handled is we add the length of the array to the negative index to retrieve the element. Correct me if I am wrong. – Daria Dec 01 '21 at 12:22
  • Yes, that's right. For negative indexes you really do need the size of the list to get the right index, not only for bounds checking. – Blckknght Dec 02 '21 at 02:09
0

C doesn't check if the index and I assume CPython neither. This means that when you try to access an element in a list, it will be assumed the index is not out of bounds. This means it's up to the programmer to perform those checks using the length of the list. In the code you provide, there is a function valid_index(...) that does this.

  • According to the source code, I believe this (https://github.com/python/cpython/blob/0aa0bd056349f73de9577ccc38560c1d01864d51/Objects/listobject.c#L235-L242) performs the checks for us. Hence, the question: is Py_SIZE(p) essentially the length of the array? Which would imply that the length is used upon element access. – Daria Dec 01 '21 at 08:51
  • 1
    Yes, Py_SIZE(p) returns the size of an object. So in this case the length of the list is used upon element access to perform a bounds check (as good practice dictates). – practical-plankton Dec 01 '21 at 08:59