25

I understand that a list is different from an array. But still, O(1)? That would mean accessing an element in a list would be as fast as accessing an element in a dict, which we all know is not true. My question is based on this document:

list

----------------------------
| Operation | Average Case |
|-----------|--------------|
|    ...    |     ...      |
|-----------|--------------|
|  Get Item |     O(1)     |
----------------------------

and this answer:

Lookups in lists are O(n), lookups in dictionaries are amortized O(1), with regard to the number of items in the data structure.

If the first document is true, then why is accessing a dict faster than accessing a list if they have the same complexity?

Can anybody give a clear explanation of this, please? I would say it always depends on the size of the list/dict, but I need more insight on this.

Diego Mora Cespedes
  • `we all know is not true` do we now? `a list is different from an array` is it really? (you are confusing lookup and arbitrary item access by index, which is what "get item" means) – njzk2 May 20 '16 at 15:35
  • Accessing an element at a particular index in a list is *O(1)*, but searching for an element in a list is *O(N)*. – Vedang Mehta May 20 '16 at 15:36
  • I think the first doc talks about accessing a list element by index, the second about testing for membership (in a list or a hash-backed set). Those are different things. – pvg May 20 '16 at 15:38
  • Python's list **is** an array in C terms. It's more like C++ arrays or vectors than (linked) lists. – robyschek May 20 '16 at 15:46

4 Answers

36

Get item means retrieving the item at a specific index, while lookup means searching for whether some element exists in the list. To do the latter, unless the list is sorted, you need to iterate over all the elements, performing O(n) get-item operations, which leads to O(n) lookup.
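
To make the distinction concrete, here is a minimal sketch (the variable names are my own, not part of the original answer):

```python
lst = ["a", "b", "c", "d"]

# Get item: jump straight to the slot at index 2 -- O(1).
print(lst[2])        # 'c'

# Lookup: is "c" anywhere in the list? May inspect every element -- O(n).
print("c" in lst)    # True

# Roughly what the membership test does under the hood:
def lookup(lst, target):
    for item in lst:             # up to n get-item operations
        if item == target:
            return True
    return False

print(lookup(lst, "c"))          # True
```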

A dictionary maintains a smart data structure (a hash table) under the hood, so you do not need to probe O(n) times to find whether an element exists; a constant number of probes suffices (in the average case), leading to O(1) lookup.
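
A quick, unscientific timing sketch (my own; exact numbers will vary by machine) shows the gap in practice:

```python
import timeit

n = 10**6
lst = list(range(n))
dct = dict.fromkeys(range(n))

# Membership test for an element near the end of the collection:
print(timeit.timeit(lambda: n - 1 in lst, number=100))  # O(n) scan per call
print(timeit.timeit(lambda: n - 1 in dct, number=100))  # O(1) hash probe per call
```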

amit
  • Indeed, I mixed "get" and "lookup", which are two very different things. Everything is clear now. Thanks! – Diego Mora Cespedes Jun 12 '16 at 12:11
  • It's not O(1) by the computer science definition – Dave Crooke Oct 12 '17 at 22:09
  • @DaveCrooke Yes it is, it is O(1) average case. Note that `O(f(n))` doesn't always refer to the "worst case", which is what I think you are confusing it with; it can be applied to all analysis types, including the average case. – amit Oct 13 '17 at 09:50
  • Dict keys are stored in a hash table. Multiple items can have the same hash value. So how can dict lookup be said to be O(1)? – variable Aug 17 '20 at 16:51
  • This does not answer the OP's exact question, "Why is a list access O(1) in Python?" – oeter Dec 08 '21 at 20:11
  • @oeter `Get item is getting an item in a specific index` (in the answer). This is the reason, and the answer later explains why it's different from lookup, which is O(n). – amit Dec 09 '21 at 17:22
  • @variable Ideally you don't have too many collisions, which is why it's O(1) (average and amortized). In the worst case, where there are multiple hash collisions, it is O(n). You can find a detailed explanation [here](https://stackoverflow.com/questions/9214353/hash-table-runtime-complexity-insert-search-and-delete) – HellSaint Feb 04 '22 at 02:02
  • No, lookup doesn't mean searching through a list. Lookup is the same as "get" in a lot of English technical jargon. This answer shouldn't be the accepted answer. – user1689987 May 13 '23 at 15:07
27

Accessing a list `l` at index `n` (`l[n]`) is O(1) because the list is not implemented as a vanilla linked list, where one would need to jump between pointers (value, next-->) n times to reach cell index n.

If the memory were contiguous and the entry size fixed, reaching a specific entry would be trivial: we would know to jump n times the entry size (like classic arrays in C).

However, since list entries vary in size, the Python implementation uses a contiguous block of memory just for the pointers to the values. This makes indexing a list (`l[n]`) an operation whose cost is independent of the size of the list or the value of the index.
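
One rough way to convince yourself of this (a sketch of my own; timings will vary):

```python
import timeit

lst = list(range(10**7))

# Indexing the first and the last element takes about the same time:
# position n is just base_address + n * pointer_size, one hop away.
print(timeit.timeit(lambda: lst[0], number=10**6))
print(timeit.timeit(lambda: lst[10**7 - 1], number=10**6))
```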

For more information see http://effbot.org/pyfaq/how-are-lists-implemented.htm

Hanan Shteingart
10

That is because Python stores the address of each element of a list in a separate array. When we want the element at index n, all we have to do is look up the nth entry of that address array, which gives us the address of the nth element, so we can get its value in O(1) complexity.

Python does some neat tricks to make these arrays expandable as the list grows. Thus we get the flexibility of lists and the speed of arrays. The trade-off is that the interpreter has to reallocate memory for the address array whenever the list grows beyond its current capacity.
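
You can watch this over-allocation happen in CPython (an implementation detail that may change between versions) by tracking the list's memory footprint as it grows:

```python
import sys

lst = []
last_size = 0
for i in range(64):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last_size:
        # The footprint jumps only occasionally: CPython over-allocates,
        # so most appends do not trigger a reallocation.
        print(f"len={len(lst):3d}  bytes={size}")
        last_size = size
```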

amit has explained in his answer to this question why lookups in dictionaries are faster than in lists.

Faihan
-2

The minimum possible I can see is O(log n), from a strict computer science standpoint.
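
The reasoning behind this claim, spelled out in the comments below, can be sketched numerically (my own illustration, not part of the original answer):

```python
import math

# With b-bit keys there are only 2**b distinct values, so holding n
# distinct keys forces b >= log2(n) bits per key. Reading one key thus
# costs Omega(log n) bit operations -- the sense in which no lookup is
# O(1) "from a strict computer science standpoint".
for n in (10**3, 10**6, 10**19):
    print(f"n = {n}: keys need at least {math.ceil(math.log2(n))} bits")
```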

Dave Crooke
  • Well, the question is specifically about Python. – vaultah Oct 12 '17 at 22:18
  • This is wrong; O(1) is the average time complexity of hash tables in general, and of Python's dictionary specifically. I suggest reading some about [hash tables](https://en.wikipedia.org/wiki/Hash_table), and [average case analysis](https://en.wikipedia.org/wiki/Average-case_complexity) if you are unfamiliar with them. – amit Oct 13 '17 at 09:49
  • I have a PhD in Comp Sci. You are making the same mistake as those who claim radix sort is O(1), as I once did. Here's the reality: for the keys to be unique, the minimum key length is O(log n), and hence it takes O(log n) to match one. If they are not unique, your hash buckets contain more than one item. As I said, you are thinking at the developer level; big O() is computer science, and it is mathematically impossible to have an O(1) data structure of any kind. Those of you who have Comp Sci undergrad degrees may remember the proof that sorting unique keys is Omega(n log n). – Dave Crooke Oct 14 '17 at 03:12
  • I agree this is, strictly speaking, true from a theoretical point of view. But the practical importance of complexity calculations is in their impact on real-world performance. In the real world, data structures have a finite feasible size, and fixed-size keys of a modest length are adequate for that size (e.g. 64-bit keys are good for more than 10**19 elements); the main interest is in how computation time scales for examples for which these fixed-size keys suffice. At the hardware level there may be 64-bit parallelism, which removes any advantage in speed for using smaller keys. [cont. ...] – Elroch Nov 18 '17 at 14:24
  • [... cont.] If 64 bits is not enough, 128 bits should cope with all the computer data in the world for quite a while, and a factor of no more than 2 for all feasible examples is well described as O(1) for all practical purposes. – Elroch Nov 18 '17 at 14:26