
How do you override the result of unpacking syntax *obj and **obj?

For example, can you somehow create an object thing which behaves like this:

>>> [*thing]
['a', 'b', 'c']
>>> [x for x in thing]
['d', 'e', 'f']
>>> {**thing}
{'hello world': 'I am a potato!!'}

Note: the iteration via __iter__ ("for x in thing") returns different elements from the *splat unpack.

I had a look in operator.mul and operator.pow, but those functions only cover the two-operand forms, a*b and a**b, and seem unrelated to splat operations.

wim
  • I'm 99% sure you cannot ... but would love to be proved wrong here (see http://stackoverflow.com/questions/9722272/overload-operator-in-python-or-emulate-it) – Joran Beasley Mar 12 '14 at 23:17
  • You should be able to just implement the iterable or mapping protocols. I'm having strange problems getting the mapping to work right, though. – user2357112 Mar 12 '14 at 23:24

2 Answers


* iterates over an object and uses its elements as arguments. ** iterates over an object's keys and uses __getitem__ (equivalent to bracket notation) to fetch key-value pairs. To customize *, simply make your object iterable, and to customize **, make your object a mapping:

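import collections.abc

class MyIterable(object):
    def __iter__(self):
        return iter([1, 2, 3])

# The collections.Mapping alias was removed in Python 3.10;
# the abstract base class lives in collections.abc.
class MyMapping(collections.abc.Mapping):
    def __iter__(self):
        return iter('123')
    def __getitem__(self, item):
        return int(item)
    def __len__(self):
        return 3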
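
As a rough usage sketch (not part of the original answer), the two classes can be unpacked into a call like this. The helper name show is made up for illustration, and the classes are repeated so the snippet runs on its own:

```python
from collections.abc import Mapping


class MyIterable:
    def __iter__(self):
        return iter([1, 2, 3])


class MyMapping(Mapping):
    def __iter__(self):
        return iter('123')

    def __getitem__(self, item):
        return int(item)

    def __len__(self):
        return 3


def show(*args, **kwargs):
    # Hypothetical helper: hands back whatever the call unpacked.
    return args, kwargs


print(show(*MyIterable()))   # positional arguments come from __iter__
print(show(**MyMapping()))   # keyword arguments come from the keys plus __getitem__
```

Note that the mapping's keys have to be strings for ** in a call, which is why MyMapping iterates over '123' rather than over integers.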

If you want * and ** to do something besides what's described above, you can't. I don't have a documentation reference for that statement (since it's easier to find documentation for "you can do this" than "you can't do this"), but I have a source quote. The bytecode interpreter loop in PyEval_EvalFrameEx calls ext_do_call to implement function calls with * or ** arguments. ext_do_call contains the following code:

        if (!PyDict_Check(kwdict)) {
            PyObject *d;
            d = PyDict_New();
            if (d == NULL)
                goto ext_call_fail;
            if (PyDict_Update(d, kwdict) != 0) {

which, if the ** argument is not a dict, creates a dict and performs an ordinary update to initialize it from the keyword arguments (except that PyDict_Update won't accept a list of key-value pairs). Thus, you can't customize ** separately from implementing the mapping protocol.
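
The same behaviour can be observed from Python. This is only a sketch against current CPython: Pairs, collect and try_unpack are hypothetical names, and collections.UserDict stands in for "a mapping that is not a dict".

```python
from collections import UserDict


class Pairs:
    """Iterable of key-value pairs, but not a mapping."""
    def __iter__(self):
        return iter([('a', 1)])


def collect(**kwargs):
    # Hypothetical helper: returns the keyword arguments it received.
    return kwargs


def try_unpack(obj):
    # Returns the unpacked kwargs, or the name of the exception raised.
    try:
        return collect(**obj)
    except TypeError as exc:
        return type(exc).__name__


print(try_unpack(UserDict({'a': 1})))  # a non-dict mapping is accepted
print(try_unpack(Pairs()))             # an iterable of pairs is rejected
print(dict(Pairs()))                   # although dict() itself accepts pairs
```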

Similarly, for * arguments, ext_do_call performs

        if (!PyTuple_Check(stararg)) {
            PyObject *t = NULL;
            t = PySequence_Tuple(stararg);

which is equivalent to tuple(args). Thus, you can't customize * separately from ordinary iteration.
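
One way to check this from Python is to count how often __iter__ runs. The class Counting and the helper collect below are hypothetical names used only for this sketch:

```python
class Counting:
    """Iterable that records how many times __iter__ is called."""
    def __init__(self):
        self.iter_calls = 0

    def __iter__(self):
        self.iter_calls += 1
        return iter('abc')


def collect(*args):
    # Hypothetical helper: returns the positional arguments it received.
    return args


thing = Counting()
print(collect(*thing) == tuple(thing))  # both go through __iter__
print(thing.iter_calls)                 # one call per unpack or tuple() so far
```

Every * unpack shows up as exactly one __iter__ call, so there is no separate hook the object could use to tell a splat apart from ordinary iteration.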

It'd be horribly confusing if f(*thing) and f(*iter(thing)) did different things. In any case, * and ** are part of the function call syntax, not separate operators, so customizing them (if possible) would be the callable's job, not the argument's. I suppose there could be use cases for allowing the callable to customize them, perhaps to pass dict subclasses like defaultdict through...

user2357112
  • Yeah I already know this much. I'm talking about customising splat independently of `__iter__`. I've added a note in my question to try and make it more explicit that I am talking about the same object `thing` – wim Mar 12 '14 at 23:31
  • @wim: Then no. It'd be horribly confusing. – user2357112 Mar 12 '14 at 23:32
  • I don't disagree. But I want to know how these things fit into the grammar/language because they seem to be qualitatively kinda different from other operators. – wim Mar 12 '14 at 23:34
  • @wim: They're not separate operators. They're part of the function call syntax. You can't customize them separately for the same reason you can't customize what happens when something gets passed as a regular argument. – user2357112 Mar 12 '14 at 23:38
  • I believe you about that, but I think that a proper answer should cite a reference or provide some evidence that this is true, rather than just assert that it is so because it would be "horribly confusing" otherwise. Note that `f(*thing)` and `f(*iter(thing))` disassemble to different byte code. – wim Mar 12 '14 at 23:48
  • @wim: The documentation just says "this is what `*` and `**` do", not "you can't make them do something else". I guess I'll go hunt down which part of the source implements the relevant opcodes. – user2357112 Mar 13 '14 at 00:10
  • so it looks like, in cpython at least, splat and splatsplat is not something that the object to the right hand side ever sees or indeed even knows about at all. it's part of the grammar instead. – wim Mar 14 '14 at 19:54
  • I see a PEP for two new magic methods coming out of this: `__splat__` and `__splatty_splat__`, which default back to `__iter__` and `__iter__`-with-`__getitem__` under normal circumstances. – Mad Physicist Jan 18 '18 at 16:44
  • @MadPhysicist splatty-splat is `.keys()` with `__getitem__`. The `__iter__` not required. – wim Jan 18 '18 at 18:04
  • @wim. So it won't try `__iter__` as a backup, in the same way that `__iter__` fails over to `__len__` and `__getitem__`? – Mad Physicist Jan 18 '18 at 19:56
  • No, it will not. – wim Jan 18 '18 at 19:59
  • @MadPhysicist: `keys` is the heuristic it uses to decide whether an object is a mapping at all, so if the object you're trying to `**` unpack doesn't have `keys`, Python doesn't think it should be `**` unpackable. – user2357112 Jan 18 '18 at 20:11
  • @user2357112. TIL. It's been a very productive day for me, at your and wim's expense. Thank you both. – Mad Physicist Jan 19 '18 at 02:27
  • @user2357112supportsMonica I just posted basically the same question over [here](https://stackoverflow.com/q/62492107/5472354), which got promptly marked as a duplicate and rightfully so. However, before I even posted that question I tried out the first version you suggest implementing `__iter__`, and it failed. So then I saw you suggesting the same, and I ran your code, and it failed. What am I missing? Is this not up to date anymore? Do you *have* to subclass a mapping now? – mapf Jun 20 '20 at 22:30
  • @mapf: The `__iter__` one is for `*` unpacking, not `**` unpacking. – user2357112 Jun 20 '20 at 22:45
  • @user2357112supportsMonica thanks! I realized that now as well. I talked about this with another user in my post who pointed out your answer to me. He suggested implementing the `__getitem__` and `keys` methods for the ** unpacking to work without having to subclass `Mapping` which is what my question was originally about. Maybe you could include that in your answer as well? – mapf Jun 20 '20 at 22:49
  • @mapf: It technically only looks for `keys` and `__getitem__` in the current CPython implementation, but the [language reference](https://docs.python.org/3/reference/expressions.html#calls) specifies that the unpacked argument must be a mapping: "If the syntax `**expression` appears in the function call, `expression` must evaluate to a mapping, the contents of which are treated as additional keyword arguments." Implementing just enough to get `**` to take your object is a recipe for bugs and confusion, so I've deliberately excluded it from my answer. – user2357112 Jun 20 '20 at 23:01
  • @user2357112supportsMonica I see, so only implementing `keys` and `__getitem__` would be more of a hack I guess. It does work though. I wonder in what context it would fail. Thanks a lot for elaborating! – mapf Jun 21 '20 at 08:15

I did succeed in making an object that behaves how I described in my question, but I really had to cheat. So I'm just posting this here for fun:

class Thing:
    def __init__(self):
        self.mode = 'abc'
    def __iter__(self):
        if self.mode == 'abc':
            yield 'a'
            yield 'b'
            yield 'c'
            self.mode = 'def'
        else:
            yield 'd'
            yield 'e'
            yield 'f'
            self.mode = 'abc'
    def __getitem__(self, item):
        return 'I am a potato!!'
    def keys(self):
        return ['hello world']

The iterator protocol is satisfied by a generator object returned from __iter__ (note that a Thing() instance itself is not an iterator, though it is iterable). The mapping protocol is satisfied by the presence of keys() and __getitem__. Yet, in case it wasn't already obvious, you can't unpack *thing twice in a row and get a, b, c both times, so it's not really overriding splat the way it pretends to.
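
For anyone who wants to watch the trick and its limitation in action, here is a condensed, self-contained version of the class followed by the calls from the question. The `yield from` shorthand is a cosmetic change for this sketch, and the fact that the ** unpack leaves the mode untouched relies on CPython reading keys() rather than calling __iter__:

```python
class Thing:
    def __init__(self):
        self.mode = 'abc'

    def __iter__(self):
        # Generator function: the mode flips only once the
        # generator has been run to exhaustion.
        if self.mode == 'abc':
            yield from 'abc'
            self.mode = 'def'
        else:
            yield from 'def'
            self.mode = 'abc'

    def __getitem__(self, item):
        return 'I am a potato!!'

    def keys(self):
        return ['hello world']


thing = Thing()
print([*thing])             # first iteration
print([x for x in thing])   # second iteration, mode has flipped
print({**thing})            # goes through keys() and __getitem__
print([*thing], [*thing])   # two splats in a row no longer agree
```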

wim
  • It's easy enough to make `*thing` and `**thing` at least always act as you like without depending on order--just define `def keys(self): return ('hello world',)` – Nick Matteo Apr 13 '17 at 18:04
  • Is there any particular reason you didn't have `__len__` return `1`? Also, any reason you need to extend `Mapping`? – Mad Physicist Jan 18 '18 at 16:31
  • @MadPhysicist If you don't inherit `Mapping`, you'll need to [quack like a mapping](https://stackoverflow.com/q/40667093/674039). In the context of `Thing`, it means we must define a `keys` method. If you do inherit `Mapping`, you are required to define the abstract method `__len__`, but I don't care what it returns here - just that the name resolves. – wim Jan 18 '18 at 17:59
  • @wim. I always thought you could get away with just `__len__` returning 1. Interesting. – Mad Physicist Jan 18 '18 at 19:55