How does polymorphism work, under the hood, in python?

In python, if I have some function e.g.

def f(x):
    return x + 2*x + 3*x + 4*x + 5*x + 6*x

then according to dis.dis(f) python translates this to bytecode instructions which describe a cycle of:

  • loading the next constant value
  • loading x again
  • multiplying them together
  • adding the product (onto the accumulation of preceding terms)
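The cycle described above can be inspected programmatically. This is a minimal sketch, assuming CPython; opcode names differ between versions (for example `BINARY_MULTIPLY` and `BINARY_ADD` before 3.11, a single `BINARY_OP` from 3.11 onward), so the code filters only on the `BINARY_` prefix:

```python
import dis

def f(x):
    return x + 2*x + 3*x + 4*x + 5*x + 6*x

# Collect the opcode names that CPython generated for f.
opnames = [ins.opname for ins in dis.get_instructions(f)]

# Keep only the arithmetic opcodes (the multiplications and additions).
binary_ops = [name for name in opnames if name.startswith("BINARY_")]

print(opnames)
print(len(binary_ops))  # one opcode per * and per + in the expression
```

The exact listing is an implementation detail of the interpreter version in use, so the printed names should be read as illustrative rather than stable.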

But if x is a numpy array or an instance of a Python class, rather than a basic data type, then presumably the interpreter must do additional work (e.g. the binary multiply op-code must somehow cause other functions to be called, perhaps starting with some attribute lookups, which usually correspond to entirely different op-codes). This seems very different from ordinary assembly language, where a simple arithmetic operation would be atomic (and would not cause the CPU to execute extra instructions that aren't visible in the disassembly listing).
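The situation can be reproduced with a small experiment. This is a sketch only: `Loud` and `g` are hypothetical names introduced for illustration, and the opcode names assume CPython. The bytecode of `g` contains no attribute-lookup or call instruction, yet a Python-level method still runs when the multiply opcode executes:

```python
import dis

class Loud:
    """Hypothetical class that records which special methods get called."""
    def __init__(self):
        self.calls = []

    def __mul__(self, other):
        self.calls.append("__mul__")
        return self

    def __rmul__(self, other):
        self.calls.append("__rmul__")
        return self

def g(x):
    return 2 * x

obj = Loud()
# int does not know how to multiply by a Loud (it yields NotImplemented),
# so the interpreter falls back to the reflected method Loud.__rmul__.
g(obj)

opnames = [ins.opname for ins in dis.get_instructions(g)]
print(opnames)    # no LOAD_ATTR or CALL opcode for the special method
print(obj.calls)
```

The fallback from the left operand's method to the right operand's reflected method happens inside the handling of the single arithmetic opcode, not as separate bytecode steps.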

Is there documentation for how the Python interpreter operates, and what sequence of steps it actually performs when evaluating an expression involving polymorphism? (Ideally, at a lower level of detail than what a step-through Python debugger would expose?)

Edit:

To support polymorphism, an arithmetic operation must involve not only arithmetic but also type checking, attribute look-up, conditional jumps, and function calls. (All these things have their own op-codes.) Is it correct that cpython implements this by making the arithmetic op-code itself perform many complex actions in a single step of the interpreter (except for the instructions contained in the called function), instead of by stepping the interpreter through a sequence of separate op-codes to achieve the same result (e.g., LOAD_ATTR, CALL_FUNCTION, etc)?

Is there any documentation such as a table, for all op-codes, describing all of the actions each op-code may cause?

benjimin
  • I think [this](https://stackoverflow.com/questions/13334218/where-are-operators-mapped-to-magic-methods-in-python) is a duplicate. Will wait for others to comment. – juanpa.arrivillaga Jul 19 '17 at 03:09
  • But basically, if you want to know exactly what happens, you have to read the source code. This is implementation dependent. Here is a current [CPython implementation](https://github.com/python/cpython/blob/master/Python/ceval.c#L1191). That is the big switch statement that actually evaluates the op-codes. – juanpa.arrivillaga Jul 19 '17 at 03:17
  • @juanpa.arrivillaga I think `*` for a numpy array will never go through CPython's opcode switch but will call a matrix multiplication function. Check my update; I think that is what numpy really does under the hood. Again, I'm not sure whether that's the right file, but I'd be very surprised if numpy did this any other way :D – armnotstrong Jul 19 '17 at 03:25
  • @armnotstrong no it *certainly does not*. For starters, `*` in numpy *does not do matrix multiplication*, it does *vectorized/broadcasted multiplication*. Second, that is implemented through the `__mul__` magic method, like everything else. There are no "basic types" in Python anyway; that is not a useful distinction to make in a language like Python. As you see in the source code, *everything except unicode strings is treated the same, and only an optimization is done for them*. – juanpa.arrivillaga Jul 19 '17 at 03:30
  • @juanpa.arrivillaga I'm not a mod, but I think you're right. Each binary operator corresponds to an opcode, which is handled by the interpreter in a switch case. This is where Python dispatches to the relevant method, if need be. – SwiftsNamesake Jul 19 '17 at 03:41
  • @juanpa.arrivillaga Can I understand this as: CPython first switches on the `*` opcode and then dispatches to the method implemented as `__mul__`, if the class defines one? – armnotstrong Jul 19 '17 at 03:41
  • @armnotstrong see [this](https://stackoverflow.com/a/13335119/5014455) answer, to the potential duplicate – juanpa.arrivillaga Jul 19 '17 at 03:45
  • I think the answer is that yes, most of the opcodes are very complex. Even in x86 assembly language there are at least some instructions that conditionally branch to other instructions (and instructions that achieve the effect of multiple other instructions), but in cpython bytecode a much wider range of opcodes (even seemingly basic arithmetic opcodes) have the same property, and that is how cpython implements polymorphism. Since the bytecode is implementation and version dependent, it is probably documented nowhere except the cpython source. – benjimin Jan 08 '18 at 21:35
  • The answer would be very different for e.g. C++, where the compiled instructions are not individually polymorphic and instead either multiple versions of machine code are generated (for the same template with different call signatures) or else the code includes an explicit series of instructions to perform a lookup and conditional branching (for inheritance). – benjimin Jan 08 '18 at 22:27

1 Answer

You can define how an operator behaves for a custom class by implementing the corresponding magic method:

>>> class MyClass(object):
...     def __add__(self, x):
...         return '%s plus %s' % (self, x)
...     def __mul__(self, x):
...         return "%s mul %s" % (self, x)

I think that is also how numpy's array does the job under the hood.

You could trace this for more information about the implementation of `*` for numpy's array.
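One detail of how the operator finds the magic method can be checked from plain Python. This is a minimal sketch: the class body and the instance-level lambda are illustrative additions, not part of the original answer. The `*` operator looks `__mul__` up on the type of the object, not on the instance, which differs from an explicit `obj.__mul__(...)` attribute access:

```python
import operator

class MyClass(object):
    def __mul__(self, x):
        return "class-level mul"

obj = MyClass()

# Storing __mul__ on the instance does not change what the * operator does,
# because the implicit lookup goes through type(obj).
obj.__mul__ = lambda x: "instance-level mul"

print(obj * 3)               # dispatches to MyClass.__mul__
print(operator.mul(obj, 3))  # same dispatch as the * operator
print(obj.__mul__(3))        # explicit lookup finds the instance attribute
```

This is one reason the operator is not simply equivalent to an attribute lookup followed by a call.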

SwiftsNamesake
armnotstrong
  • In this case, what op-codes does the interpreter actually execute? Isn't there a different pair of op-codes involved for looking up the `__add__` attribute and calling the function method? – benjimin Jul 19 '17 at 02:53
  • I have never used `numpy` before. As a math idiot, the only thing I could understand is matrix multiplication, but I am checking `numpy`'s code for more information to verify this. :D – armnotstrong Jul 19 '17 at 02:57
  • I'm *not* asking what syntax can be used for operator overloading in python. I'm asking how it works at a lower level, at least for cpython. – benjimin Jul 19 '17 at 03:00
  • It's no different from a normal function call once the operator has been overridden; it may load a C library for performance reasons. I'm not sure what you are asking. – armnotstrong Jul 19 '17 at 03:06
  • You can't really 'override' operators since they're not defined for custom types to begin with. I edited the answer to reflect that. Hope you don't mind. – SwiftsNamesake Jul 19 '17 at 03:36
  • @SwiftsNamesake sure not, thanks for reminding that – armnotstrong Jul 19 '17 at 03:38