48

From what I've been aware of, using [], {} or () to instantiate objects returns a new instance of list, dict or tuple respectively; a new instance object with a new identity.

This was pretty clear to me until I actually tested it and I noticed that () is () actually returns True instead of the expected False:

>>> () is (), [] is [], {} is {}
(True, False, False)

as expected, this behavior is also manifested when creating objects with list(), dict() and tuple() respectively:

>>> tuple() is tuple(), list() is list(), dict() is dict()
(True, False, False)

The only relevant piece of information I could find in the docs for tuple() states:

[...] For example, tuple('abc') returns ('a', 'b', 'c') and tuple([1, 2, 3]) returns (1, 2, 3). If no argument is given, the constructor creates a new empty tuple, ().

Suffice to say, this isn't sufficient for answering my question.

So, why do empty tuples have the same identity whilst others like lists or dictionaries do not?

Dimitris Fasarakis Hilliard
  • 150,925
  • 31
  • 268
  • 253

1 Answers1

54

In short:

Python internally creates a C list of tuple objects whose first element contains the empty tuple. Every time tuple() or () is used, Python will return the existing object contained in the aforementioned C list and not create a new one.

Such mechanism does not exist for dict or list objects which are, on the contrary, recreated from scratch every time.

This is most likely related to the fact that immutable objects (like tuples) cannot be altered and, as such, are guaranteed to not change during execution. This is further solidified when considering that frozenset() is frozenset() returns True; like () an empty frozenset is considered an singleton in the implementation of CPython. With mutable objects, such guarantees are not in place and, as such, there's no incentive to cache their zero element instances (i.e their contents could change with the identity remaining the same).

Take note: This isn't something one should depend on, i.e one shouldn't consider empty tuples to be singletons. No such guarantees are explicitly made in the documentation so one should assume it is implementation dependent.


How it is done:

In the most common case, the implementation of CPython is compiled with two macros PyTuple_MAXFREELIST and PyTuple_MAXSAVESIZE set to positive integers. The positive value for these macros results in the creation of an array of tuple objects with size PyTuple_MAXSAVESIZE.

When PyTuple_New is called with the parameter size == 0 it makes sure to add a new empty tuple to the list if it doesn't already exist:

if (size == 0) {
    free_list[0] = op;
    ++numfree[0];
    Py_INCREF(op);          /* extra INCREF so that this is never freed */
}

Then, if a new empty tuple is requested, the one that is located in the first position of this list is going to get returned instead of a new instance:

if (size == 0 && free_list[0]) {
    op = free_list[0];
    Py_INCREF(op);
    /* rest snipped for brevity.. */

One additional reason causing an incentive to do this is the fact that function calls construct a tuple to hold the positional arguments that are going to be used. This can be seen in the load_args function in ceval.c:

static PyObject *
load_args(PyObject ***pp_stack, int na)
{
    PyObject *args = PyTuple_New(na);
    /* rest snipped for brevity.. */

which is called via do_call in the same file. If the number of arguments na is zero, an empty tuple is going to be returned.

In essence, this might be an operation that's performed frequently so it makes sense to not reconstruct an empty tuple every single time.


Further reading:

A couple more answers shed light on CPython's caching behaviour with immutables:

  • For integers, another answer that digs in the source can be found here.
  • For strings, a handful of answers can be found here, here and here.
Community
  • 1
  • 1
Dimitris Fasarakis Hilliard
  • 150,925
  • 31
  • 268
  • 253
  • 1
    Would it be correct to say that this behavior does NOT adhere to the documented standard? And therefore is a bug? – vijrox Jul 12 '16 at 14:24
  • 4
    @vijrox definitely not considered a bug; my advice is don't depend on this since there is no reference on them being singletons in the [Python Reference Manual](https://docs.python.org/3/reference/index.html#reference-index). Notable singletons like `None`, `Ellipsis` and others are explicitly mentioned, `()` or `frozenset()` are not therefore it is safe to assume they are considered an implementation detail which one should not rely on. – Dimitris Fasarakis Hilliard Jul 12 '16 at 14:30
  • 2
    Right -- however I was referring to the part of the documentation you pointed out in the question which says "If no argument is given, the constructor creates a new empty tuple, ()." This behavior seems (to me) to exactly contradict that part of the documentation. – vijrox Jul 12 '16 at 14:33
  • 3
    @vijrox ah yes, that did make me scratch my head too. Maybe it is a case of not trying to make the reference manual too confusing; maybe it is written that way as to not enforce other implementations to have that restriction. I can't really know but, I get the confusion. – Dimitris Fasarakis Hilliard Jul 12 '16 at 14:50
  • 9
    @vijrox no, it's just implementation detail. Empty tuples behave 100% accurate to language specification. Language specification does not say, that empty tuple cannot be a singleton. Same as language specification _does not say_ that small integers have to be cached (and yet, [in CPython they are](http://stackoverflow.com/questions/306313/is-operator-behaves-unexpectedly-with-integers)). It's not guaranteed that `() is ()` is True, same as is not guaranteed that `3 is 3` is True, but if it does, it does not break any kind of contract between tuple class and they users. – Łukasz Rogalski Jul 12 '16 at 18:31
  • 1
    The main pint is that, since tuples are immutable, from the point of view of Python code, a new empty tuple is not, for practical proposes, distinguishable from a recycled empty tuple. The behavior could make one could write buggy code if trying to use an empty tuple as a sentinel object, for example, to provide a default behavior, and therefore mistake a "not meant as sentinel" empty tuple as that one. Just use proper sentinels. – jsbueno Jul 14 '16 at 12:48
  • 3
    The pypy implementation for example is different. `() is ()` still returns True, but `() is tuple()` or `tuple() is tuple()` return False. And in jython all three forms return False. – mata Aug 02 '16 at 08:30
  • @mata quite interesting, especially the `PyPy` case. This just goes to show that different implementations implement things according to their needs. Nice comment. – Dimitris Fasarakis Hilliard Aug 02 '16 at 08:37