I would like to better understand the Python 3.x data model, but I cannot find a complete and precise explanation of Python object behaviour.
I am looking for references; it would be great if every case I show below could be linked to a Python API reference, a PEP, or any other valuable source. Thank you in advance for your advice.
Let's say we have a complex Python structure for testing purposes:
d1 = {
    'id': 5432,
    'name': 'jlandercy',
    'pets': {
        'andy': {
            'type': 'cat',
            'age': 3.5,
        },
        'ray': {
            'type': 'dog',
            'age': 6.5,
        },
    },
    'type': str,
    'complex': (5432, 6.5, 'cat', str),
    'list': ['milk', 'chocolate', 'butter'],
}
1) Immutable atomic objects are singletons
However I create a new integer:
import copy

n1 = 5432
n2 = int(5432)
n3 = copy.copy(n1)
n4 = copy.deepcopy(n1)
No new copy of this number is created; instead, each name points to the same object as d1['id']. More concisely:
d1['id'] is n1
...
They all have the same id. I cannot create a new instance of int with value 5432; therefore it is a singleton.
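One way to probe this observation is to check equality and identity separately for an integer built at run time. This is only a minimal sketch: the helper name `probe_int` is hypothetical, and whether the identity test comes out True is an implementation detail. The language reference notes that, for immutable types, an operation that computes a new value may return a reference to an existing object with the same type and value, so neither outcome is guaranteed.

```python
import copy


def probe_int(value):
    """Return (equal, same_object) for `value` and an int parsed at run time."""
    parsed = int(str(value))  # built at run time instead of taken from a literal
    return parsed == value, parsed is value


n1 = 5432
# In CPython, copy.copy and copy.deepcopy hand back immutable atomic objects as-is
print(copy.copy(n1) is n1, copy.deepcopy(n1) is n1)
# Equality always holds; the identity result depends on the implementation
print(probe_int(n1))
```

Comparing the two elements of the tuple returned by `probe_int` shows whether "same value" also means "same object" on a given interpreter.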
2) Immutable and Iterable objects might be singletons...
The previous observation also holds for str objects, which are immutable and iterable. All of the following variables:
s1 = 'jlandercy'
s2 = str('jlandercy')
s3 = copy.copy(s1)
s4 = copy.deepcopy(s1)
point to the object initially created for d1['name']. Strings are also singletons.
...but not exactly...
Tuples are also immutable and iterable, but they do not behave like strings. It is known that the magic empty tuple is a singleton:
() is ()
But other tuples are not.
t1 = (5432, 6.5, 'cat', str)
...instead they hash equally
They do not have the same id:
id(d1['complex']) != id(t1)
But all items within those two structures are atomic, so they point to the same instances. The important point is that both structures hash the same way:
hash(d1['complex']) == hash(t1)
So they can be used as dictionary keys. This is even true for nested tuples:
t2 = (1, (2, 3))
t3 = (1, (2, 3))
They have the same hash.
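The hashing observation can be checked without relying on identity at all. The sketch below assumes only the documented rule that objects which compare equal must have the same hash value; the names `same_hash` and `lookup` are illustrative.

```python
t2 = (1, (2, 3))
t3 = (1, (2, 3))

# Equal tuples must hash equally, whether or not they are the same object
same_hash = hash(t2) == hash(t3)

# so either one can be used to look up a value stored under the other
lookup = {t2: 'found'}
print(same_hash, lookup[t3])
```

The dictionary lookup succeeds because it relies on the hash and on equality, not on the id of the key.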
3) Passing dictionary by double dereferencing works as shallow copy of it
Let's define the following function:
def f1(**kwargs):
    kwargs['id'] = 1111
    kwargs['pets']['andy'] = None
It receives our trial dictionary through double dereferencing (the ** operator): first-level members are copied, but deeper ones are passed by reference.
The output of this simple program illustrates it:
print(d1)
f1(**d1)
print(d1)
It prints:
{'complex': (5432, 6.5, 'cat', <class 'str'>),
 'id': 5432,
 'list': ['milk', 'chocolate', 'butter'],
 'name': 'jlandercy',
 'pets': {'andy': {'age': 3.5, 'type': 'cat'},
          'ray': {'age': 6.5, 'type': 'dog'}},
 'type': <class 'str'>}
{'complex': (5432, 6.5, 'cat', <class 'str'>),
 'id': 5432,
 'list': ['milk', 'chocolate', 'butter'],
 'name': 'jlandercy',
 'pets': {'andy': None, 'ray': {'age': 6.5, 'type': 'dog'}},
 'type': <class 'str'>}
The dictionary d1 has been modified by function f1, but not completely. Member id is kept because we worked on a copy, but member pets is also a dictionary and the shallow copy did not copy it, so it has been modified.
This behaviour is similar to the behaviour of copy.copy for dict objects, where we need copy.deepcopy to get a recursive and complete copy of the object.
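The comparison with copy.deepcopy can be sketched as follows. This is a minimal illustration on a reduced version of the test dictionary; the names `original` and `backup` are hypothetical and are not part of the program above.

```python
import copy


def f1(**kwargs):
    kwargs['id'] = 1111            # rebinds a key in the new dict only
    kwargs['pets']['andy'] = None  # mutates the nested dict shared with the caller


original = {'id': 5432, 'pets': {'andy': {'type': 'cat'}}}
backup = copy.deepcopy(original)   # fully independent, recursive copy

f1(**original)
print(original['id'])              # 5432: the top-level binding is untouched
print(original['pets']['andy'])    # None: the nested dict was shared
print(backup['pets']['andy'])      # {'type': 'cat'}: the deep copy is unaffected
```

Calling `f1(**copy.deepcopy(original))` instead would leave `original` entirely unchanged, at the cost of copying the whole structure.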
My requests are:
- Are my observations correctly interpreted?
  - Immutable atomic objects are singletons
  - Immutable and iterable objects might be singletons, but not exactly; instead they hash equally
  - Passing a dictionary by double dereferencing works as a shallow copy of it
- Are those behaviours well documented somewhere?
- For each case, what are the correct properties and behaviours?