What's going on behind the scenes?
In Python, when you assign a value to a key:
dictionary[key] = value
Python translates the above syntactic sugar into:
dictionary.__setitem__(key, value)
As you can see, behind the scenes Python calls the __setitem__ method. The __setitem__ method corresponds directly to the operation of indexing a data structure and assigning a new value to that index. It can be overridden to customize its behavior.
The default behavior of __setitem__ for Python dictionaries is to replace the key's value if the key exists, and to insert the key if it does not. It does not raise a KeyError for a missing key. To see this, you can subclass the dict class and override __setitem__ to display its arguments:
>>> class Dict(dict):
...     def __setitem__(self, key, value):
...         print('Putting "%s" in dict with value of "%s"' % (key, value))
...         super().__setitem__(key, value)
...
>>>
>>> d = Dict()
>>> d['name'] = 'Hammy'
Putting "name" in dict with value of "Hammy"
>>> d['age'] = 25
Putting "age" in dict with value of "25"
>>> d
{'name': 'Hammy', 'age': 25}
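To contrast assignment with lookup on a plain dict, here is a minimal sketch (the variable names are only for illustration). Assignment goes through __setitem__ and inserts or replaces. Subscript lookup goes through __getitem__, which is where a KeyError comes from:

```python
# Assignment on a plain dict never fails for a missing key: it inserts it.
d = {}
d["name"] = "Hammy"       # key absent: inserted
d["name"] = "Hamilton"    # key present: value replaced

# Reading a missing key goes through __getitem__, which raises KeyError.
try:
    d["age"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(d)              # {'name': 'Hamilton'}
print(lookup_failed)  # True
```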
Does Python have an std::map equivalent?
Like @MSeifert said, you can customize what happens when a key is not present during a lookup (d[key]) by overriding the __missing__ method.
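As a rough illustration, here is a small dict subclass that defines __missing__. The class name ZeroDict is made up for this example. Note that returning a value from __missing__ does not by itself store the key, and that get() and the in operator do not consult __missing__ at all:

```python
class ZeroDict(dict):
    """Hypothetical dict subclass that reports 0 for absent keys."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only when `key` is not found.
        # Returning a value here does not insert the key.
        return 0


counts = ZeroDict(a=1)
print(counts["a"])        # 1
print(counts["zzz"])      # 0
print("zzz" in counts)    # False
print(counts.get("zzz"))  # None
```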
That is what the collections.defaultdict class does in the standard library. It overrides __missing__ to create the missing key and map a default value of your choice to it. Here's the relevant snippet from the CPython source:
static PyObject *
defdict_missing(defdictobject *dd, PyObject *key)
{
    PyObject *factory = dd->default_factory;
    PyObject *value;
    /* ... */
    value = PyEval_CallObject(factory, NULL);
    if (value == NULL)
        return value;
    if (PyObject_SetItem((PyObject *)dd, key, value) < 0) {
        Py_DECREF(value);
        return NULL;
    }
    return value;
}
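If you don't read C, the same idea can be sketched in pure Python. This is a simplified stand-in, not the real implementation: the class name SimpleDefaultDict is hypothetical, and the KeyError branch follows the documented behavior of defaultdict when default_factory is None rather than anything shown in the snippet above.

```python
class SimpleDefaultDict(dict):
    """Hypothetical, simplified stand-in for collections.defaultdict."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # Documented defaultdict behavior: without a factory, a missing
        # key is still a KeyError.
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()  # call the factory with no arguments
        self[key] = value               # store it, like PyObject_SetItem
        return value


counts = SimpleDefaultDict(int)
counts["a"] += 1   # lookup triggers __missing__, then the sum is stored
print(counts)      # {'a': 1}
```

The key point is the order of operations: call the factory, store the result under the missing key, then return it.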
Note that defaultdict is implemented in C. Here's an example of the usage:
>>> from collections import defaultdict
>>> map = defaultdict(int)
>>> map['a'] = 1
>>> map['b'] = 2
>>> map['c'] # default factory function `int` called
0
>>> map
defaultdict(<class 'int'>, {'a': 1, 'b': 2, 'c': 0})
defaultdict pretty much matches the behavior of std::map::operator[]. If a key is not present when using std::map::operator[], the operator value-initializes an object of the mapped type (in effect a "factory function" for the value type, producing 0 for an int, for example), inserts it under the missing key, and returns a reference to it.
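One consequence worth seeing in code: with defaultdict, merely reading a missing key by subscript inserts it, which is similar to what std::map::operator[] does. Membership tests and get() do not have that side effect. A small sketch (the name scores is just for illustration):

```python
from collections import defaultdict

scores = defaultdict(int)
scores["a"] = 1

# Membership tests and get() do not trigger the default factory.
print("b" in scores)    # False
print(scores.get("b"))  # None
print(len(scores))      # 1

# Subscripting does, and the new key stays in the mapping.
print(scores["b"])      # 0
print(len(scores))      # 2
```

So if you only want to check for a key without growing the mapping, use in or get() rather than subscripting.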
So if you want something that behaves like std::map, use defaultdict. Note I said "like", though. That's because C++ and Python are two completely different languages. Saying a data structure in one language has an exact equivalent in another is very rarely correct.