Why is plus-equals valid for list and dictionary?

Question

Adding a dictionary to a list using the __iadd__ notation seems to add the keys of the dictionary as elements in the list. Why? For example

a = []
b = {'hello':'world'}
a += b
# a is now ['hello']

The documentation for plus-equals on collections doesn't imply to me that this should happen:

For instance, to execute the statement x += y, where x is an instance of a class that has an __iadd__() method, x.__iadd__(y) is called. If x is an instance of a class that does not define a __iadd__() method, x.__add__(y) and y.__radd__(x) are considered, as with the evaluation of x + y

But, logically, both

a + b # TypeError Exception

and

b + a # TypeError Exception

are not defined. Furthermore, b += a raises a TypeError too. I don't see any special implementation in the source that would explain things, but I'm not 100% sure where to look.

The closest question on SO I found is this one, asking about += on dictionaries, but that's just asking about a data structure with itself. This one had a promising title about list self-addition, but it claims "__add__" is being applied under the hood, which shouldn't be defined between lists and dictionaries.

My best guess is that the __iadd__ is invoking extend, which is defined here, and then it tries to iterate over the dictionary, which in turn yields its keys. But this seems... weird? And I don't see any intuition of that coming from the docs.

When you use `for key in b`, the dictionary gives you its keys, not its values. `list(b)` also gives you keys, not values. So maybe it works like `a += list(b)` — furas, Dec 09 '20 at 23:58
@assembly From the docs on collections, I expected += to fail whenever + is undefined (I think of it as shorthand for a = a + b, so it succeeding where that fails is surprising to me). Is it just a quirk of their optimization or something, or is there somewhere more obvious to look where they explain what the operators do? I guess I just misunderstood what the contract was for the operator. — en_Knight, Dec 09 '20 at 23:59
@furas, that could be, though my guess is it isn't *explicitly* calling 'a += list(b)' - if it is, and you know of an example of the source code or some PEP backing it up, I'd definitely accept that answer! — en_Knight, Dec 10 '20 at 00:01
It is not explicitly calling `list(b)`, but someone decided that `b` will yield only its keys when you use it in a `for` loop, in `list()`, and likewise in `+=`. Someone decided that keys can be more useful than values. — furas, Dec 10 '20 at 00:03
@furas Why doesn't it do the same for `a + b` then? Iterator behaviour is documented but this doesn't seem to be. — Selcuk, Dec 10 '20 at 00:04
`+` and `+=` may not use the same code - and maybe they don't have time to write it in the same way. — furas, Dec 10 '20 at 00:06

assembly_wizard · Accepted Answer · 2020-12-10T00:19:26.083

My best guess is that the iadd is invoking extend, which is defined here, and then it tries to iterate over the dictionary, which in turn yields its keys. But this seems... weird? And I don't see any intuition of that coming from the docs.

This is the correct explanation of why this happens. I've found the relevant docs that confirm it.

In the docs you can see that in fact __iadd__ is equivalent to .extend(), and here it says:

list.extend(iterable): Extend the list by appending all the items from the iterable.

In the part about dicts it says:

Performing list(d) on a dictionary returns a list of all the keys used in the dictionary

So to summarize, a_list += a_dict is equivalent to a_list.extend(iter(a_dict)), which is equivalent to a_list.extend(a_dict.keys()), which extends the list with the keys of the dictionary.
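As a quick check of that chain of equivalences, here is a minimal sketch. The variable names are made up for illustration, and the key order shown assumes Python 3.7+, where dicts preserve insertion order.

```python
a_dict = {"hello": "world", "foo": "bar"}

# Three spellings that should all append the dictionary's keys.
via_iadd = []
via_iadd += a_dict

via_extend = []
via_extend.extend(a_dict)

via_keys = []
via_keys.extend(a_dict.keys())

print(via_iadd)  # ['hello', 'foo']

# All three match what list(a_dict) produces.
assert via_iadd == via_extend == via_keys == list(a_dict)
```

The values never appear, because iterating a dict only ever yields its keys.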

We could discuss why things are this way, but I don't think we will find a clear-cut answer. I think += is a very useful shorthand for .extend, and also that a dictionary should be iterable (personally I'd prefer it returning .items(), but oh well).

Edit: You seem to be interested in the actual implementation of CPython, so here are some code pointers:

dict iterator returning keys:

static PyObject *
dict_iter(PyDictObject *dict)
{
    return dictiter_new(dict, &PyDictIterKey_Type);
}

list.extend(iterable) calling iter() on its argument:

static PyObject *
list_extend(PyListObject *self, PyObject *iterable)
{
    ...
    it = PyObject_GetIter(iterable);
    ...
}

+= being equivalent to list.extend():

static PyObject *
list_inplace_concat(PyListObject *self, PyObject *other)
{
    ...
    result = list_extend(self, other);
    ...
}

and then this method seems to be referenced above inside a PySequenceMethods struct, which seems to be an abstraction of sequences that defines common actions such as concatenating in-place, and concatenating normally (which is defined as list_concat in the same file and you can see is not the same).
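The difference between the two code paths is visible from Python as well. The following is a small sketch (the names items, alias, and plus_error are only for illustration): += accepts any iterable and mutates the list in place, while plain + insists on another list.

```python
items = [1]
alias = items            # second name for the same list object

items += (2, 3)          # tuple
items += "ab"            # str iterates character by character
items += {"k": "v"}      # dict iterates over its keys

print(items)             # [1, 2, 3, 'a', 'b', 'k']
print(alias is items)    # True: the list was mutated in place

plus_error = None
try:
    items + (2, 3)       # plain + only accepts another list
except TypeError as exc:
    plus_error = str(exc)

print(plus_error)        # the TypeError message from list + tuple
```

So the surprise in the question is not specific to dictionaries: any iterable on the right-hand side of += is consumed the way .extend() would consume it.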

Ah, you found those extra sections on lists. Okay, I buy that, thanks! (I also would prefer for it to return .items... but probably not worth breaking all existing code...) — en_Knight, Dec 10 '20 at 00:11
@en_Knight I've added some code pointers since you've asked about the implementation itself, hope it helps :) — assembly_wizard, Dec 10 '20 at 00:20
