148

I've read several Python tutorials (Dive Into Python, for one), and the language reference on Python.org, but I don't see why the language needs tuples.

Tuples have no methods compared to a list or set, and if I must convert a tuple to a set or list to be able to sort them, what's the point of using a tuple in the first place?

Immutability?

Why does anyone care if a variable lives at a different place in memory than when it was originally allocated? This whole business of immutability in Python seems to be over emphasized.

In C/C++ if I allocate a pointer and point to some valid memory, I don't care where the address is located as long as it's not null before I use it.

Whenever I reference that variable, I don't need to know if the pointer is still pointing to the original address or not. I just check for null and use it (or not).

In Python, when I allocate a string (or tuple) assign it to x, then modify the string, why do I care if it's the original object? As long as the variable points to my data, that's all that matters.

>>> x='hello'
>>> id(x)
1234567
>>> x='good bye'
>>> id(x)
5432167

x still references the data I want, why does anyone need to care if its id is the same or different?

Glenn Maynard
pyNewGuy
  • 1,489
  • 2
  • 10
  • 3
  • 13
    you're paying attention to the wrong aspect of mutability: "whether the id is the same or different" is just a side effect; "whether the data pointed to by other references which previously pointed to the same object now reflect updates" is critical. – Charles Duffy Feb 01 '10 at 02:44
  • @Charles Duffy That's why I don't understand how EVER anyone would want immutability. :) – The incredible Jan Nov 08 '22 at 08:05
  • @TheincredibleJan, ...there are some Rich Hickey talks I can point you at, or the entire first half of the book The Joy of Clojure (the first half is "why", the second half is "how"). Python isn't nearly aggressive enough about using immutable data types to gain the full set of benefits, but using a language and programming model centered around immutability makes a lot of things that can go wrong with conventional tools into complete nonissues, particularly in concurrency-heavy environments. – Charles Duffy Nov 08 '22 at 16:27
  • @TheincredibleJan, ...in the context of Python, though, it's largely about ensuring that values are safe to use as hash keys. Remember, if you have a list `['a', 'b', 'c']` and you use it as a key in a dict, _the hash of that list_ is used to look up the location in the dict. But what happens if something else has a copy of that list and changes its contents? Suddenly the dict has the value stored in the wrong place; you can no longer do an amortized-constant-time lookup. – Charles Duffy Nov 08 '22 at 16:29

9 Answers

132
  1. immutable objects can allow substantial optimization; this is presumably why strings are also immutable in Java, developed quite separately but about the same time as Python, and just about everything is immutable in truly-functional languages.

  2. in Python in particular, only immutables can be hashable (and, therefore, members of sets, or keys in dictionaries). Again, this affords optimization, but far more than just "substantial" (designing decent hash tables storing completely mutable objects is a nightmare -- either you take copies of everything as soon as you hash it, or the nightmare of checking whether the object's hash has changed since you last took a reference to it rears its ugly head).

Example of optimization issue:

$ python -mtimeit '["fee", "fie", "fo", "fum"]'
1000000 loops, best of 3: 0.432 usec per loop
$ python -mtimeit '("fee", "fie", "fo", "fum")'
10000000 loops, best of 3: 0.0563 usec per loop
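
A hedged sketch of where part of that gap comes from, assuming CPython 3 (other implementations may behave differently): a tuple literal made only of constants can be computed once at compile time and simply fetched, while a list literal has to produce a fresh object on every evaluation. The function names `make_tuple` and `make_list` are made up for this illustration.

```python
import dis


def make_tuple():
    return ("fee", "fie", "fo", "fum")


def make_list():
    return ["fee", "fie", "fo", "fum"]


# CPython stores the all-constant tuple among the function's constants.
tuple_is_constant = ("fee", "fie", "fo", "fum") in make_tuple.__code__.co_consts

# Each call hands back that same precomputed tuple object...
same_tuple_each_call = make_tuple() is make_tuple()

# ...but a list must be a new object each time, because callers may mutate it.
new_list_each_call = make_list() is not make_list()

# Inspect the bytecode to see the difference between loading and building.
dis.dis(make_tuple)
dis.dis(make_list)
print(tuple_is_constant, same_tuple_each_call, new_list_each_call)
```

The timings themselves depend on the machine and interpreter version, so only the structural difference is checked here.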
CDspace
Alex Martelli
  • 3
    I don't like the optimization argument because I've never seen a noticeable difference in performance between tuples and lists. However, I'll +1 for the second point. – Sasha Chedygov Feb 01 '10 at 03:48
  • 12
    @musicfreak, see the edit I just did where building a tuple is over 7.6 times faster than building the equivalent list -- now you can't say you've "never seen a noticeable difference" any more, unless your definition of "noticeable" is **truly** peculiar... – Alex Martelli Feb 01 '10 at 04:35
  • 4
    @Alex: I've seen improvements of several microseconds, but is that really going to change the performance of an entire application? I doubt it. I once rewrote part of a real-time application that used tuples extensively to use lists instead, and there was no noticeable performance hit because of it, so I stand by my original statement. – Sasha Chedygov Feb 01 '10 at 05:40
  • 13
    @musicfreak I think you are misusing "premature optimization is the root of all evil". There's a huge difference between doing premature optimization in an application (for example, saying "tuples are faster than lists, so we're going to use only tuples in all the app!") and doing benchmarks. Alex's benchmark is insightful and knowing that building a tuple is faster than building a list might help us in future optimization operations (when it's really needed). – Virgil Dupras Feb 01 '10 at 09:02
  • 7
    @Alex, is "building" a tuple really faster than "building a list", or are we seeing the result of the Python runtime caching the tuple? Seems the latter to me. – Kenan Banks Feb 01 '10 at 12:57
  • 1
    @Tryptych, immutability allows the pre-computing of the literal and "just fetching" it - it's even more than just caching (check w/dis.dis!). @Virgil, building a non-literal tuple is only 2-3 times faster than building a list -- the rest of the advantage comes in this case from using literals. @musicfreak, you do appear to deeply misunderstand Knuth: he never advocated using e.g. 64-bit floats in lieu of 32-bit ints just because in one app you could not "notice" the difference (such subjective criteria as "notice", in fact, he'd reject outright!-). – Alex Martelli Feb 01 '10 at 15:36
  • Supporting the fetching argument, `python -mtimeit -s "import random" "(random.random(), random.random())"` is 0.28 usec for lists, 0.23 usec for tuples. – ACoolie Feb 01 '10 at 18:31
  • 6
    @ACoolie, that's totally dominated by the `random` calls (try doing just that, you'll see!), so not very significant. Try `python -mtimeit -s "x=23" "[x,x]"` and you'll see a more meaningful speedup of 2-3 times for building the tuple vs building the list. – Alex Martelli Feb 01 '10 at 21:23
  • 2
    @Tryptych, If you are worried about the tuple being cached, compare `python -m timeit -s "x=0" "x+=1;(x,x)"` and `python -m timeit -s "x=0" "x+=1;[x,x]"` (subtract `python -m timeit -s "x=0" "x+=1"` to remove that component) – John La Rooy Feb 01 '10 at 22:01
  • 3
    suddenly i have a lot of "constant" sequences that want to be tuples – Matt Joiner Feb 02 '10 at 02:27
  • 10
    for anyone wondering -- we were able to shave off over an hour of data processing by switching from lists to tuples. – Mark Jul 01 '13 at 20:34
  • Very helpful answer, thanks, but I don't get the easier hashability. What's wrong with hashing a list? If later on the list changes, the dictionary key can still remain the hash of the previous list... . – aderchox May 02 '20 at 15:31
  • @aderchox: see "[Why can't I use a list as a dict key in python?](https://stackoverflow.com/q/7257588/90527)" – outis Oct 24 '21 at 12:05
42

None of the answers above point out the real issue of tuples vs lists, which many new to Python seem to not fully understand.

Tuples and lists serve different purposes. Lists store homogenous data. You can and should have a list like this:

["Bob", "Joe", "John", "Sam"]

The reason that is a correct use of lists is because those are all homogenous types of data, specifically, people's names. But take a list like this:

["Billy", "Bob", "Joe", 42]

That list is one person's full name, and their age. That isn't one type of data. The correct way to store that information is either in a tuple, or in an object. Let's say we have a few:

[("Billy", "Bob", "Joe", 42), ("Robert", "", "Smith", 31)]

The immutability and mutability of Tuples and Lists is not the main difference. A list is a list of the same kind of items: files, names, objects. Tuples are a grouping of different types of objects. They have different uses, and many Python coders abuse lists for what tuples are meant for.

Please don't.
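
A small sketch of the record-style use described above. The field names (`first`, `middle`, `last`, `age`) and the `Person` type are assumptions chosen for illustration; they are not part of the original example.

```python
from collections import namedtuple

# Each tuple is one record: first name, middle name, last name, age.
people = [("Billy", "Bob", "Joe", 42), ("Robert", "", "Smith", 31)]

# Position carries meaning, so a record unpacks cleanly into named variables.
ages_by_last_name = {}
for first, middle, last, age in people:
    ages_by_last_name[last] = age

# A namedtuple keeps the tuple behaviour and gives the positions names.
Person = namedtuple("Person", ["first", "middle", "last", "age"])
records = [Person(*row) for row in people]
oldest = max(records, key=lambda p: p.age)

print(ages_by_last_name, oldest.first)
```

The outer container is a list because people can be added, removed, or reordered without changing what any single record means.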


Edit:

I think this blog post explains why I think this better than I did:

Grant Paul
  • 20
    I think you have a vision which is not agreed at least by me, don't know the others. – Stefano Borini Feb 01 '10 at 01:54
  • 19
    I also strongly disagree with this answer. The homogeneity of the data has absolutely nothing to do with whether you should use a list or a tuple. Nothing in Python suggests this distinction. – Glenn Maynard Feb 01 '10 at 02:11
  • You can store everything, and anything in a list, and all native python containers. – pyNewGuy Feb 01 '10 at 02:23
  • 14
    Guido made this point a few years ago too. http://aspn.activestate.com/ASPN/Mail/Message/python-list/1566320 – John La Rooy Feb 01 '10 at 02:33
  • 3
    This is a valid point in almost any other language with tuples, but it doesn't apply to Python. The language itself does not restrict the use of lists to only homogeneous types of data, so you're free to use them in any way you want to. If they were meant to be used in this specific way, why doesn't the language enforce it? Either way, your answer isn't really valid. – Sasha Chedygov Feb 01 '10 at 03:54
  • 14
    Even though Guido (the designer of Python) intended for lists to be used for homogeneous data and tuples for heterogeneous, the fact is that the language doesn't enforce this. Therefore, I think this interpretation is more of a style issue than anything else. It so happens that in many people's typical use cases, lists tend to be array-like and tuples tend to be record-like. But this shouldn't stop people from using lists for heterogeneous data if it suits their problem better. As the Zen of Python says: Practicality beats purity. – John Y Feb 01 '10 at 05:36
  • +1: purely for bringing this interpretation of tuples and lists to my attention. I've perhaps had too low level an understanding of them until now. – James Hopkin Feb 01 '10 at 11:56
  • I revise my statement and I'm voting you up (I didn't vote you down originally). I learned something new from your answer. Still don't agree fully, but you raised an interesting point nevertheless. – Stefano Borini Feb 01 '10 at 12:10
  • 1
    The accepted answer here makes a similar point: http://stackoverflow.com/questions/1708510/python-list-vs-tuple-when-to-use-each – James Hopkin Feb 01 '10 at 12:52
  • 9
    @Glenn, you're basically wrong. One of the chief uses of tuples is as a composite data type for storing multiple pieces of data that are related. The fact that you can iterate over a tuple and perform many of the same operations does not change this. (As reference consider that tuples in many other languages do not have the same iterable features as their list counterparts) – HS. Feb 01 '10 at 15:47
  • No, that has nothing to do with homogeneity. It's perfectly normal to use an array for varying data types if you have a need to modify them, or a tuple for homogenous data types if you want immutability. `items = []; items.append("text"); items.append(100); for i in items: print i` is perfectly valid, and a tuple couldn't do that at all. – Glenn Maynard Feb 03 '10 at 08:02
  • 1
    Glenn, it is valid code, but it is incorrect style. If the index matters, use a tuple, if the index does not refer to a particular element, use a list. If you need mutability, then create a MutableTuple, or whatever. – Grant Paul Feb 04 '10 at 02:54
  • 7
    I voted this down for one reason. "The reason that is a correct use of lists is because those are all homogenous types of data" and "The correct way to store that information is either in a tuple" are both assertions that you give no reasons for. As far as I can tell an equivalent answer could reverse tuple and lists and convey the same amount of information. – Dustin Wyatt Feb 05 '10 at 18:33
  • 1
    Therms: the reason that it is valid is because that is how the language was designed and implemented. Look above, it links to Guido talking about the exact same thing. Tuples are not specific to python, and in more functional languages where they are used more often the distinction is more obvious, but there is that same distinction in python, it is just not as enforced. A classic example: A row in a database isn't a list. A tuple is a grouping of related objects into a single object, such as a database row. A database, then is a list of tuples, not a tuple of lists. – Grant Paul Feb 06 '10 at 00:52
  • 3
    This answer explains nothing. Imagine someone tells you to put your socks on before your shoes, and you ask them why. They respond, "It's the right way, plus Shakespeare, Einstein and Rembrandt all did it that way, you're not smarter than them are you?" At a level of basic logic this response fails by (1) begging the question, a fallacy wherein an assertion is used as evidence that the same assertion is correct, and (2) argument by authority, using the views of an expert as evidence the views are correct. The expert's reasons for holding a view matter, not the fact that he holds it. – Chris May 02 '12 at 17:57
  • 1
    @chpwn: "the reason that it is valid is because that is how the language was designed and implemented." This quote demonstrates why your answer is wrong. Python allows lists to be heterogeneous and tuples to be homogenous. The language implementation contradicts your answer. There are good reasons for following your suggestion as a matter of style. Use the good reasons instead of sticking with the bad ones. – Chris May 02 '12 at 18:31
  • Lists are basically linked lists/vectors (data and length mutable), and tuples are basically const arrays (data and length immutable). Both can and should be able to have heterogenous data, in order to be able to solve real world problems. Guido often says many things about Python which make no sense in the real world, creator of the language or not. Language creators are not infallible; we will often make assumptions which are completely oblivious. – Mark Jul 01 '13 at 20:30
  • @grant fields in a db row are editable; so they are nothing like tuples. if anything, the corresponding python base object would be a dict (named fields with editable data). a db would be a list of dicts -- an editable object containing editable objects. depending on implementation details, a list of lists or dict of dicts could be faster though. – Mark Jul 01 '13 at 20:33
  • 1
    Lists could also contain different types of objects! – LookIntoEast Mar 22 '17 at 18:55
  • 1
    I down-voted because this completely fails to address the immutability aspect. – sferencik Nov 25 '17 at 22:08
26

if I must convert a tuple to a set or list to be able to sort them, what's the point of using a tuple in the first place?

In this particular case, there probably isn't a point. This is a non-issue, because this isn't one of the cases where you'd consider using a tuple.

As you point out, tuples are immutable. The reasons for having immutable types apply to tuples:

  • copy efficiency: rather than copying an immutable object, you can alias it (bind a variable to a reference)
  • comparison efficiency: when you're using copy-by-reference, you can compare two variables by comparing location, rather than content
  • interning: you need to store at most one copy of any immutable value
  • there's no need to synchronize access to immutable objects in concurrent code
  • const correctness: some values shouldn't be allowed to change. This (to me) is the main reason for immutable types.

Note that a particular Python implementation may not make use of all of the above features.
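
As a hedged illustration of the copy-efficiency point, here is what CPython happens to do (this is an implementation detail, not a language guarantee): "copying" a tuple can hand back the very same object, while copying a list has to allocate a new one.

```python
import copy

point = (1, 2, 3)
items = [1, 2, 3]

# CPython detail: there is no need to duplicate an immutable tuple,
# so these "copies" are just new references to the same object.
tuple_copy = tuple(point)
shallow_copy = copy.copy(point)

# A list must really be copied, otherwise changes would leak between owners.
list_copy = list(items)
list_copy.append(4)

print(tuple_copy is point, shallow_copy is point, list_copy is items, items)
```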

Dictionary keys must be immutable, otherwise changing the properties of a key-object can invalidate invariants of the underlying data structure. Tuples can thus potentially be used as keys. This is a consequence of const correctness.
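
A minimal sketch of the invariant that breaks, using a hypothetical `MutableKey` class whose hash depends on state that can change. Built-in lists avoid the problem by refusing to be hashed at all.

```python
class MutableKey:
    """Hypothetical key type whose hash depends on mutable state."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, MutableKey) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


key = MutableKey(1)
table = {key: "stored"}
found_before = key in table

# Change the key after insertion: the entry stays filed under the old hash.
key.value = 2
found_after = key in table                    # looks in the slot for the new hash
found_by_old_value = MutableKey(1) in table   # right slot, but the keys no longer compare equal
entry_still_there = len(table) == 1

print(found_before, found_after, found_by_old_value, entry_still_there)
```

The entry is still in the dict, yet neither the changed key nor a key equal to the original value can find it.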

See also "Introducing tuples", from Dive Into Python.

Bill the Lizard
outis
  • 2
    id((1,2,3))==id((1,2,3)) is false. You can't compare tuples just by comparing the location, because there's no guarantee that they were copied by reference. – Glenn Maynard Feb 01 '10 at 01:39
  • @Glenn: Note the qualifying remark "when you're using copy-by-reference". While the coder can create their own implementation, copy-by-reference for tuples is largely a matter for the interpreter/compiler. I was mostly referring to how `==` is implemented at the platform level. – outis Feb 01 '10 at 03:40
  • 1
    @Glenn: also note that copy-by-reference doesn't apply to the tuples in `(1,2,3) == (1,2,3)`. That's more a matter of interning. – outis Feb 01 '10 at 03:56
  • Like I said rather clearly, *there's no guarantee that they were copied by reference*. Tuples aren't interned in Python; that's a string concept. – Glenn Maynard Feb 01 '10 at 21:57
  • Like I said very clearly: I'm not talking about the programmer comparing tuples by comparing location. I'm talking about the possibility that the platform can, which can guarantee copy-by-reference. Also, interning can be applied to any immutable type, not only strings. The main Python implementation may not intern immutable types, but the fact Python has immutable types makes interning an option. – outis Feb 01 '10 at 23:07
  • Also, `id((1,2,3)) != id((1,2,3))` demonstrates tuples aren't interned; it doesn't demonstrate tuples don't use copy-by-reference. – outis Feb 01 '10 at 23:38
  • To say why Python has tuples, we have to put on our language designer hat and consider both the language users and implementors. Tuples are implemented by the language implementors; copy-by-reference, strict equality and interning of tuples are thus all concerns of the implementors, not Python users. The more powerful languages allow for extension by the users, so users can apply these features to their own datatypes, but that's not germane to tuples. The efficiency afforded by these implementation details, along with synchronization and const correctness, is what's relevant to users. – outis Feb 01 '10 at 23:43
15

Sometimes we like to use objects as dictionary keys

For what it's worth, tuples recently (2.6+) grew index() and count() methods
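
Both points can be sketched in a few lines; the `grid` and `rolls` names are only illustrative.

```python
# Tuples are hashable when all their elements are, so they work as dict keys.
grid = {(0, 0): "origin", (1, 2): "treasure"}
at_origin = grid[(0, 0)]

# A list in the same role is rejected outright.
try:
    {[0, 0]: "origin"}
    list_key_error = None
except TypeError as exc:
    list_key_error = str(exc)

# index() and count(), available on tuples since Python 2.6.
rolls = (3, 6, 3, 1)
first_six = rolls.index(6)
threes = rolls.count(3)

print(at_origin, list_key_error, first_six, threes)
```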

John La Rooy
  • 5
    +1: A mutable list (or mutable set or mutable dictionary) as a dictionary key can't work. So we need immutable lists ("tuples"), frozen sets, and ... well ... a frozen dictionary, I suppose. – S.Lott Feb 01 '10 at 01:23
9

I've always found having two completely separate types for the same basic data structure (arrays) to be an awkward design, but not a real problem in practice. (Every language has its warts, Python included, but this isn't an important one.)

Why does anyone care if a variable lives at a different place in memory than when it was originally allocated? This whole business of immutability in Python seems to be over emphasized.

These are different things. Immutability isn't related to the place an object is stored in memory; it means the stuff the object holds can't change.

Python objects can't change location after they're created, mutable or not. (More accurately, the value of id() can't change--same thing, in practice.) The internal storage of mutable objects can change, but that's a hidden implementation detail.

>>> x='hello'
>>> id(x)
1234567
>>> x='good bye'
>>> id(x)
5432167

This isn't modifying ("mutating") the original object; it's creating a new object, binding the same name to it, and discarding the old one. Compare to a mutating operation:

>>> a = [1,2,3]
>>> id(a)
3084599212L
>>> a[1] = 5
>>> a
[1, 5, 3]
>>> id(a)
3084599212L
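
The same contrast as a Python 3 sketch that uses a second name instead of comparing `id()` values by eye. The augmented assignment at the end is an extra case not shown in the transcripts: it mutates a list in place but has to build a new tuple.

```python
a = [1, 2, 3]
alias = a
a[1] = 5                      # mutation: the one list object changes
mutation_visible = alias == [1, 5, 3] and alias is a

a += [4]                      # lists extend in place, so alias still tracks a
list_same_object = alias is a

t = (1, 2, 3)
t_alias = t
t += (4,)                     # builds a new tuple and rebinds the name t
tuple_rebound = t is not t_alias and t_alias == (1, 2, 3)

x = "hello"
old = x
x = "good bye"                # rebinding: the old string object is untouched
string_untouched = old == "hello"

print(mutation_visible, list_same_object, tuple_rebound, string_untouched)
```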

As others have pointed out, immutability is what allows using arrays (in the form of tuples) as keys to dictionaries, and in other data structures that need immutability.

Note that keys for dictionaries do not have to be completely immutable. Only the part of the object used for hashing and equality needs to be immutable; for some uses, this is an important distinction. For example, you could have a class representing a user, which defines equality and its hash by the unique username. You could then hang other mutable data on the class--"user is logged in", etc. Since this doesn't affect equality or the hash, it's possible and perfectly valid to use this as a key in a dictionary. This isn't too commonly needed in Python; I just point it out since several people have claimed that keys need to be "immutable", which is only partially correct. I've used this many times with C++ maps and sets, though.
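
A minimal sketch of such a class. The `User` name, its attributes, and the `sessions` dict are hypothetical, and the sketch assumes the username itself is never changed after the object has been used as a key.

```python
class User:
    """Hypothetical user record: identity comes from the username only."""

    def __init__(self, username):
        self.username = username
        self.logged_in = False    # mutable, not part of equality or the hash

    def __eq__(self, other):
        return isinstance(other, User) and self.username == other.username

    def __hash__(self):
        return hash(self.username)


alice = User("alice")
sessions = {alice: "session-1"}

# Changing state that the hash ignores leaves the dict lookup intact.
alice.logged_in = True
still_found = sessions[alice]
found_by_equal_user = sessions[User("alice")]

print(still_found, found_by_equal_user)
```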

Glenn Maynard
  • `a = [1,2,3]; id(a)` gives 3084599212L, and after `a[1] = 5` the list is `[1, 5, 3]` and `id(a)` is still 3084599212L. You've just modified a mutable data type, so it doesn't make sense related to the original question. `x='hello'; id(x)` gives 12345 and `x='goodbye'; id(x)` gives 65432. Who cares if it is a new object or not? As long as x points to the data I've assigned, that's all that matters. – pyNewGuy Feb 01 '10 at 02:25
  • 5
    You're confused well beyond my ability to help you. – Glenn Maynard Feb 01 '10 at 03:03
  • +1 for pointing out confusion in the sub-questions, which seem to be the main source of difficulty in perceiving the value of tuples. – outis Feb 01 '10 at 04:07
  • 1
    If I could, another +1 for pointing out that the true rubric for keys is whether or not the object is hashable (http://docs.python.org/glossary.html#term-hashable). – outis Feb 01 '10 at 04:14
9

As gnibbler offered in a comment, Guido had an opinion that is not fully accepted/appreciated: “lists are for homogeneous data, tuples are for heterogeneous data”. Of course, many of the opposers interpreted this as meaning that all elements of a list should be of the same type.

I like to see it differently, not unlike others also have in the past:

blue = 0, 0, 255
alist = ["red", "green", blue]

Note that I consider alist to be homogeneous, even if type(alist[1]) != type(alist[2]).

If I can change the order of the elements and I won't have issues in my code (apart from assumptions, e.g. “it should be sorted”), then a list should be used. If not (like in the tuple blue above), then I should use a tuple.
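
One way to read that rule of thumb, as a sketch (the shuffling is only there to make the point, and the variable names are illustrative): reordering the list leaves the same collection of colors, while reordering the tuple gives a different color.

```python
import random

blue = (0, 0, 255)                 # positions mean (red, green, blue)
colors = ["red", "green", blue]    # a collection of colors; position carries no meaning

# Shuffling the list leaves the same collection of colors.
shuffled = random.sample(colors, k=len(colors))
same_collection = sorted(map(repr, shuffled)) == sorted(map(repr, colors))

# Reordering the tuple changes what it represents.
r, g, b = blue
reordered = (b, g, r)              # now full red, no blue

print(same_collection, reordered)
```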

tzot
6

They are important since they guarantee the caller that the object they pass won't be mutated. If you do this:

a = [1,1,1]
doWork(a)

The caller has no guarantee of the value of a after the call. However,

a = (1,1,1)
doWork(a)

Now you, as the caller or as a reader of this code, know that a is the same after the call. In this scenario you could always make a copy of the list and pass that, but then you are wasting cycles instead of using a language construct that makes more semantic sense.
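
A sketch of the difference, with a hypothetical `do_work` that mutates its argument in place. A second hypothetical function, `rebind`, shows that merely rebinding the parameter name inside the callee never affects the caller, for lists or tuples alike.

```python
def do_work(values):
    """Hypothetical callee that mutates its argument in place."""
    values.append(0)


def rebind(values):
    """Hypothetical callee that only rebinds its local name."""
    values = [0, 0, 0]
    return values


a = [1, 1, 1]
do_work(a)            # the caller's list has been changed behind its back
list_after = a

b = (1, 1, 1)
try:
    do_work(b)        # a tuple has no append(), so the mutation cannot happen
    error = None
except AttributeError as exc:
    error = exc

c = [1, 1, 1]
rebind(c)             # rebinding inside the function leaves the caller's list alone

print(list_after, b, type(error).__name__, c)
```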

Matthew Manela
  • 2
    This is a very secondary property of tuples. There are too many cases where you have a mutable object you want to pass to a function and not have it modified, whether it's a preexisting list or some other class. There's just no concept of "const parameters by reference" in Python (eg. const foo & in C++). Tuples happen to give you this if it happens to be convenient to use a tuple at all, but if you've received a list from your caller, are you really going to convert it to a tuple before passing it somewhere else? – Glenn Maynard Feb 01 '10 at 02:08
  • I agree with you on that. A tuple isn't the same as slapping on a const keyword. My point is that the immutability of a tuple carries added meaning to the reader of the code. Given a situation where both would work and your expectation is that it shouldn't change using the tuple will add that extra meaning for the reader (while ensuring it as well) – Matthew Manela Feb 01 '10 at 02:15
  • With `a = [1,1,1]; doWork(a)`, if doWork() is defined as `def doWork(arg): arg=[0,0,0]`, then calling doWork() on a list or a tuple has the same result. – pyNewGuy Feb 01 '10 at 02:29
1

you can see here for some discussion on this

Glenn Maynard
ghostdog74
1

Your question (and follow-up comments) focus on whether the id() changes during an assignment. Focusing on this follow-on effect of the difference between immutable object replacement and mutable object modification rather than the difference itself is perhaps not the best approach.

Before we continue, make sure that the behavior demonstrated below is what you expect from Python.

>>> a1 = [1]
>>> a2 = a1
>>> print a2[0]
1
>>> a1[0] = 2
>>> print a2[0]
2

In this case, the contents of a2 changed, even though the assignment was made only through a1, because both names refer to the same list. Contrast to the following:

>>> a1 = (1,)
>>> a2 = a1
>>> print a2[0]
1
>>> a1 = (2,)
>>> print a2[0]
1

In this latter case, we replaced the entire tuple, rather than updating its contents. With immutable types such as tuples, this is the only behavior allowed.

Why does this matter? Let's say you have a dict:

>>> t1 = (1,2)
>>> d1 = { t1 : 'three' }
>>> print d1
{(1, 2): 'three'}
>>> t1[0] = 0  ## results in a TypeError, as tuples cannot be modified
>>> t1 = (2,3) ## creates a new tuple, does not modify the old one
>>> print d1   ## as seen here, the dict is still intact
{(1, 2): 'three'}

Using a tuple, the dictionary is safe from having its keys changed "out from under it" to items which hash to a different value. This is critical to allow efficient implementation.

Charles Duffy
  • As others have pointed out, immutability != hashability. Not all tuples can be used as dictionary keys: { ([1], [2]) : 'value' } fails because the mutable lists in the tuple can be altered, but { ((1), (2)) : 'value' } is OK. – Ned Deily Feb 01 '10 at 07:39
  • Ned, that's true, but I'm not sure that the distinction is germane to the question being asked. – Charles Duffy Feb 02 '10 at 09:30
  • @K.Nicholas, the edit you approved here changed the code in such a way as to be assigning an integer, not a tuple, at all -- making the later index operations fail, so they couldn't possibly have tested that the new transcript was actually possible. Correctly-identified problem, sure; invalid solution. – Charles Duffy Oct 19 '18 at 13:30
  • @MichaelPuckettII, likewise, see above. – Charles Duffy Oct 19 '18 at 13:31