0

What is the underlying data structure of the three common python data structure: list, tuple and set?

I heard from someone that list in python is implemented as linked list. How about tuple and set? To be specific, how is the true memory of the computer allocated for their implementation?

I tried to time the task of looking for a number in the collection of number and found that tuple shows the best performance. Why is that the case?

from timeit import timeit
timeit('238478 in (x for x in range(1000000))', number = 100) #time ~ 2s
timeit('238478 in {x for x in range(1000000)}', number = 100) #time ~ 7s
timeit('238478 in [x for x in range(1000000)]', number = 100) #time ~ 7s

I tried to read the documentation but it doesn't describe the underlying architecture.

Edit: I want to study the speed of Python vs other languages. For example, array in C surely outperform list in Python. But what if I compare tuple vs array? To study these, I would like to understand more on how these data structures in Python are working in the computer.

Mayan
  • 492
  • 4
  • 11
  • umm, How about set? – Mayan Oct 25 '19 at 09:55
  • Information like that is readily available on the internet. – Trollsors Oct 25 '19 at 10:32
  • `(x for x in range(1000000))` is not a tuple! It's a generator expression which generates values on the fly. Really you should use `238478 in range(1000000)` which does a simple arithmetic computation and will be most performant. – jpp Oct 25 '19 at 10:49
  • The question as it stands makes an incorrect assumption (i.e. you are starting with a `tuple` instead of generator expression). The marked duplicate explains when to use generator expressions vs list comprehensions which is at the heart of the question you *should* be asking. – jpp Oct 25 '19 at 10:52

1 Answers1

0

A tuple is a sequence like a list. The main difference is that a tuple is immutable meaning a value assigned to it cannot be changed afterwards like in a list. i.e:

myset = (123, 'mystring', 456)
myset[1] = 'somethingelse'

the above will throw a TypeError exception.

Now a set is a different thing. A set has not order in its elements. Also the elements in a set cannot be duplicated, they are unique.

Kostas Charitidis
  • 2,991
  • 1
  • 12
  • 23
  • I understand these. But I mean what is the architecture of these data structure? For example, when I learned about array in C, array is assigned in a series of continuous bytes in the memory so that iterating over it is faster than list in python because the elements of the list are stored in different memory. They are linked by stating the address of the previous and the next element. I am curious on how other data structures in Python are implemented. – Mayan Oct 25 '19 at 09:59
  • Okay maybe this will enlighten you a bit: https://stackoverflow.com/questions/46664007/why-do-tuples-take-less-space-in-memory-than-lists . also: https://jakevdp.github.io/PythonDataScienceHandbook/figures/array_vs_list.png – Kostas Charitidis Oct 25 '19 at 10:16