7

I have a large existing program library that currently has a .NET binding, and I'm thinking about writing a Python binding. The existing API makes extensive use of signature-based overloading. So, I have a large collection of static functions like:

Circle(p1, p2, p3) -- Creates a circle through three points
Circle(p, r)       -- Creates a circle with given center point and radius
Circle(c1, c2, c3) -- Creates a circle tangent to three curves

There are a few cases where the same inputs must be used in different ways, so signature-based overloading doesn't work, and I have to use different function names, instead. For example

BezierCurve(p1,p2,p3,p4) -- Bezier curve using given points as control points
BezierCurveThroughPoints(p1,p2,p3,p4) -- Bezier curve passing through given points

I suppose this second technique (using different function names) could be used everywhere in the Python API. So, I would have

CircleThroughThreePoints(p1, p2, p3)
CircleCenterRadius(p, r)
CircleTangentThreeCurves(c1, c2, c3)

But the names look unpleasantly verbose (I don't like abbreviations), and inventing all of them will be quite a challenge, because the library has thousands of functions.

Low Priorities:
Effort (on my part) -- I don't care if I have to write a lot of code.
Performance

High Priorities:
Ease of use/understanding for callers (many will be programming newbies).
Easy for me to write good documentation.
Simplicity -- avoid the need for advanced concepts in caller's code.

I'm sure I'm not the first person who ever wished for signature-based overloading in Python. What work-arounds do people typically use?

bubba
  • You can simulate signature-based overloading in Python in different ways, but for cases like the `BezierCurve` vs `BezierCurveThroughPoints` example you're going to need to use different function names, or add an argument to a single function that differentiates them and determines the intended usage of the otherwise identical arguments. – martineau Aug 17 '14 at 00:24
  • Understood. Even in VB and C#, I need different function names in this case. The question was whether or not my Python API should use this approach in all cases. – bubba Aug 17 '14 at 02:38
  • closely related: [What is a clean, Pythonic way to have multiple constructors in Python?](https://stackoverflow.com/q/682504/3780389) – teichert Jul 19 '21 at 21:25

5 Answers

9

One option is to exclusively use keyword arguments in the constructor, and include logic to figure out which construction method was requested:

class Circle(object):
    def __init__(self, points=(), radius=None, curves=()):
        if radius is not None and len(points) == 1:
            center_point = points[0]
            # Create from radius/center point
        elif len(curves) == 3:
            pass  # create from curves
        elif len(points) == 3:
            pass  # create from points
        else:
            raise ValueError("Must provide a tuple of three points, a point and a radius, or a tuple of three curves")

You can also use classmethods to make things easier for the users of the API:

class Circle(object):
    def __init__(self, points=(), radius=None, curves=()):
        pass  # same as above

    @classmethod
    def from_points(cls, p1, p2, p3):
        return cls(points=(p1, p2, p3))

    @classmethod
    def from_point_and_radius(cls, point, radius):
        return cls(points=(point,), radius=radius)

    @classmethod
    def from_curves(cls, c1, c2, c3):
        return cls(curves=(c1, c2, c3))

Usage:

c = Circle.from_points(p1, p2, p3)
c = Circle.from_point_and_radius(p1, r)
c = Circle.from_curves(c1, c2, c3)
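
As a runnable sketch of the factory-method idea, here is a variant in which `__init__` accepts only a center and a radius and each factory does its own conversion. This differs from the keyword-based `__init__` above, and representing points as plain `(x, y)` tuples is an assumption made purely for illustration; the real binding would presumably call into the native library instead.

```python
import math


class Circle:
    """Hypothetical stand-in for the bound library's circle type."""

    def __init__(self, center, radius):
        # The plain constructor takes the canonical representation only.
        self.center = center
        self.radius = radius

    @classmethod
    def from_point_and_radius(cls, point, radius):
        return cls(point, radius)

    @classmethod
    def from_points(cls, p1, p2, p3):
        # Circumcircle of three (x, y) tuples; raises if they are collinear.
        (ax, ay), (bx, by), (cx, cy) = p1, p2, p3
        d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
        if d == 0:
            raise ValueError("points are collinear")
        a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
        ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
        uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
        return cls((ux, uy), math.hypot(ax - ux, ay - uy))
```

Each factory can carry its own docstring, so `help(Circle.from_points)` documents exactly one way of building a circle.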
dano
  • `from_points` and friends are known as factory methods, which is a very useful pattern. – Tavian Barnes Aug 16 '14 at 03:01
  • Your suggestion of `classmethod` alternative constructors is a good one, but for the overloaded `__init__` method I'd suggest using `*args` and inspecting the types, rather than requiring keywords to distinguish between the options. – Blckknght Aug 16 '14 at 03:03
  • Some new ideas to ponder. Thanks. But, at first glance, Circle.from_points((p1,p2,p3)) doesn't seem any better than CircleFromPoints(p1,p2,p3). Am I missing something? – bubba Aug 16 '14 at 03:03
  • @Blckknght -- so, are you suggesting just a single Circle function with a *args input? Then, I suppose my code inside this Circle function could determine the number/types of arguments at run-time, and act accordingly. Is that the idea? I suppose this would work. Is there some reasonable way to document the umpteen different types of tuples that would be legal inputs? – bubba Aug 16 '14 at 03:09
  • @bubba It's immediately clear that the object being returned when you call `Circle.from_points` is a `Circle`, but it is somewhat ambiguous what `CircleThroughThreePoints` is going to return without first looking at the documentation. It's also an idiom fairly commonly used in Python. The built-in [`datetime`](https://docs.python.org/2/library/datetime.html#datetime-objects) module uses it, for example. – dano Aug 16 '14 at 03:16
  • @dano -- Thanks. I somewhat see your point. To me, it seems obvious that CircleThroughThreePoints will return a Circle object. But, using an idiom that's already common in Python certainly seems beneficial. – bubba Aug 16 '14 at 03:20
  • @bubba Only providing a constructor/function with `*args` makes it hard for the user to figure out what input is valid, *and* makes it harder for you to figure out if the user provided valid input. The only way for the user to know what input is legal is to read the docs, which is not good API design. And then you have to do a bunch of introspection at runtime to figure out what the user provided and if it's valid, which is bug-prone on both your end and the user's. The more verbose approach is less elegant, but self-documenting and less susceptible to bugs. – dano Aug 16 '14 at 03:29
  • @bubba Also, I think Blckknght was suggesting using `*args` instead of named keyword arguments in conjunction with the class methods, rather than *just* providing an `__init__(*args)`. – dano Aug 16 '14 at 03:30
  • @dano -- Documentation/usability were exactly my concern. I don't mind doing the introspection coding, but I don't want to make life difficult for callers. I'm still bothered by the redundancy and repetition. I have a function that receives three points as input; putting "ThreePoints" somewhere in its name seems redundant. I guess that's just life in the Pythonic world, and I'll have to get used to it. Thanks again. – bubba Aug 16 '14 at 03:42
  • @bubba Yes, it would be nice to have true function overloading. There is [`functools.singledispatch`](https://docs.python.org/3/library/functools.html#functools.singledispatch) in Python 3.x, but it's pretty limited (top-level functions only, overloads strictly based on the first argument). – dano Aug 16 '14 at 03:46
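
The first-argument limitation of `functools.singledispatch` mentioned in the last comment can be seen in a small sketch. The `Point` and `Curve` classes and the lowercase `circle` function are hypothetical placeholders, and the returned strings stand in for real construction code.

```python
from functools import singledispatch


class Point(tuple):
    """Hypothetical point type, for illustration only."""


class Curve:
    """Hypothetical curve type, for illustration only."""


@singledispatch
def circle(first, *rest):
    # Fallback when no registered type matches the first argument.
    raise TypeError("unsupported first argument: %r" % type(first).__name__)


@circle.register(Point)
def _(first, *rest):
    # (Point, Point, Point) and (Point, radius) both land here, because only
    # the type of `first` is examined; the body must tell them apart.
    return "through points" if len(rest) == 2 else "center and radius"


@circle.register(Curve)
def _(first, *rest):
    return "tangent to curves"
```

So two of the three `Circle` overloads from the question would still need hand-written logic to separate them.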
6

There are a couple of options.

You can have one constructor that accepts an arbitrary number of arguments (with the *args and/or **kwargs syntaxes) and does different things depending on the number and types of the arguments.

Or, you can write secondary constructors as class methods. These are known as "factory" methods. If you have multiple constructors that take the same number of objects of the same classes (as in your BezierCurve example), this is probably your only option.
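
For the `BezierCurve` case, where both variants take four points, a pair of factory methods might look like the sketch below. The class layout, the method names, and the choice that the interpolating curve passes through the given points at t = 0, 1/3, 2/3 and 1 are all assumptions made for illustration; the actual library may parameterize differently.

```python
class BezierCurve:
    """Hypothetical cubic Bezier type holding four (x, y) control points."""

    def __init__(self, control_points):
        self.control_points = tuple(control_points)

    @classmethod
    def from_control_points(cls, p1, p2, p3, p4):
        return cls((p1, p2, p3, p4))

    @classmethod
    def through_points(cls, q1, q2, q3, q4):
        # Assumes the curve hits q1..q4 at t = 0, 1/3, 2/3, 1 and solves
        # for the two inner control points coordinate by coordinate.
        def combine(w1, w2, w3, w4):
            return tuple((w1 * a + w2 * b + w3 * c + w4 * d) / 6.0
                         for a, b, c, d in zip(q1, q2, q3, q4))
        return cls((q1, combine(-5, 18, -9, 2), combine(2, -9, 18, -5), q4))

    def evaluate(self, t):
        # Standard Bernstein form of a cubic Bezier curve.
        s = 1.0 - t
        weights = (s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3)
        return tuple(sum(w * p[i] for w, p in zip(weights, self.control_points))
                     for i in range(2))
```

The two call sites, `BezierCurve.from_control_points(p1, p2, p3, p4)` and `BezierCurve.through_points(p1, p2, p3, p4)`, take identical arguments, so the method name is the only thing that can distinguish them.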

If you don't mind overriding __new__ rather than __init__, you can even have both, with the __new__ method handling one form of arguments by itself and referring other kinds to the factory methods for regularizing. Here's an example of what that might look like, including doc strings for the multiple signatures to __new__:

class Circle(object):
    """Circle(center, radius) -> Circle object
       Circle(point1, point2, point3) -> Circle object
       Circle(curve1, curve2, curve3) -> Circle object

       Return a Circle with the provided center and radius. If three points are given,
       the center and radius will be computed so that the circle will pass through each
       of the points. If three curves are given, the circle's center and radius will
       be chosen so that the circle will be tangent to each of them."""

    def __new__(cls, *args):
        if len(args) == 2:
            self = super(Circle, cls).__new__(cls)
            self.center, self.radius = args
            return self
        elif len(args) == 3:
            if all(isinstance(arg, Point) for arg in args):
                return Circle.through_points(*args)
            elif all(isinstance(arg, Curve) for arg in args):
                return Circle.tangent_to_curves(*args)
        raise TypeError("Invalid arguments to Circle()")

    @classmethod
    def through_points(cls, point1, point2, point3):
        """through_points(point1, point2, point3) -> Circle object

        Return a Circle that touches three points."""

        # compute center and radius from the points...
        # then call back to the main constructor:
        return cls(center, radius)

    @classmethod
    def tangent_to_curves(cls, curve1, curve2, curve3):
        """tangent_to_curves(curve1, curve2, curve3) -> Circle object

        Return a Circle that is tangent to three curves."""

        # here too, compute center and radius from curves ...
        # then call back to the main constructor:
        return cls(center, radius)
Blckknght
  • This is a good pattern (used extensively by the BDFL himself in Google App Engine's NDB API), but you have several errors. You've used `def` instead of `class` to define the class and none of your class methods take `cls` as a first argument. Finally, it's good style to preface "private" methods (and I use the term loosely) with an underscore. – whereswalden Aug 16 '14 at 21:18
  • Thanks for pointing out my silly typos. I've fixed the `def` and missing `cls` parameters. I left the names without underscores, as the factory methods can be part of the public API (even though they're somewhat redundant, given the main constructor). – Blckknght Aug 17 '14 at 14:13
  • @Blckknght I think you're missing a `return self` in `__new__`. – dano Aug 18 '14 at 20:22
3

There are a number of modules in PyPI that can help you with signature based overloading and dispatch: multipledispatch, multimethods, Dispatching - none of which I have real experience with, but multipledispatch looks like what you want and it's well documented. Using your circle example:

from multipledispatch import dispatch

class Point(tuple):
    pass

class Curve(object):         
    pass

@dispatch(Point, Point, Point)
def Circle(point1, point2, point3):
    print "Circle(point1, point2, point3): point1 = %r, point2 = %r, point3 = %r" % (point1, point2, point3)

@dispatch(Point, int)
def Circle(centre, radius):
    print "Circle(centre, radius): centre = %r, radius = %r" % (centre, radius)

@dispatch(Curve, Curve, Curve)
def Circle(curve1, curve2, curve3):
    print "Circle(curve1, curve2, curve3): curve1 = %r, curve2 = %r, curve3 = %r" % (curve1, curve2, curve3)


>>> Circle(Point((10,10)), Point((20,20)), Point((30,30)))
Circle(point1, point2, point3): point1 = (10, 10), point2 = (20, 20), point3 = (30, 30)
>>> p1 = Point((25,10))
>>> p1
(25, 10)
>>> Circle(p1, 100)
Circle(centre, radius): centre = (25, 10), radius = 100

>>> Circle(*(Curve(),)*3)
Circle(curve1, curve2, curve3): curve1 = <__main__.Curve object at 0xa954d0>, curve2 = <__main__.Curve object at 0xa954d0>, curve3 = <__main__.Curve object at 0xa954d0>

>>> Circle()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mhawke/virtualenvs/urllib3/lib/python2.7/site-packages/multipledispatch/dispatcher.py", line 143, in __call__
    func = self.resolve(types)
  File "/home/mhawke/virtualenvs/urllib3/lib/python2.7/site-packages/multipledispatch/dispatcher.py", line 184, in resolve
    (self.name, str_signature(types)))
NotImplementedError: Could not find signature for Circle: <>

It's also possible to decorate instance methods, so you can provide multiple implementations of __init__(), which is quite nice. If you were implementing any actual behaviour within the class, e.g. Circle.draw(), you would need some logic to work out which values are available for drawing the circle (centre and radius, 3 points, etc). But as this is just to provide a set of bindings, you probably only need to call the correct native code function and pass on the parameters:

from numbers import Number
from multipledispatch import dispatch

class Point(tuple):
    pass

class Curve(object):
    pass

class Circle(object):
    "A circle class"

    # dispatch(Point, (int, float, Decimal....))
    @dispatch(Point, Number)
    def __init__(self, centre, radius):
        """Circle(Point, Number): create a circle from a Point and radius."""

        print "Circle.__init__(): centre %r, radius %r" % (centre, radius)

    @dispatch(Point, Point, Point)
    def __init__(self, point1, point2, point3):
        """Circle(Point, Point, Point): create a circle from 3 points."""

        print "Circle.__init__(): point1 %r, point2 %r, point3 = %r" % (point1, point2, point3)

    @dispatch(Curve, Curve, Curve)
    def __init__(self, curve1, curve2, curve3):
        """Circle(Curve, Curve, Curve): create a circle from 3 curves."""

        print "Circle.__init__(): curve1 %r, curve2 %r, curve3 = %r" % (curve1, curve2, curve3)

    __doc__ = '' if __doc__ is None else '{}\n\n'.format(__doc__)
    __doc__ += '\n'.join(f.__doc__ for f in __init__.funcs.values())


>>> print Circle.__doc__
A circle class

Circle(Point, Number): create a circle from a Point and radius.
Circle(Point, Point, Point): create a circle from 3 points.
Circle(Curve, Curve, Curve): create a circle from 3 curves.

>>> from decimal import Decimal
>>> for num in 10, 10.22, complex(10.22), True, Decimal(100):
...     Circle(Point((10,20)), num)
... 
Circle.__init__(): centre (10, 20), radius 10
<__main__.Circle object at 0x1d42fd0>
Circle.__init__(): centre (10, 20), radius 10.22
<__main__.Circle object at 0x1e3d890>
Circle.__init__(): centre (10, 20), radius (10.22+0j)
<__main__.Circle object at 0x1d42fd0>
Circle.__init__(): centre (10, 20), radius True
<__main__.Circle object at 0x1e3d890>
Circle.__init__(): centre (10, 20), radius Decimal('100')
<__main__.Circle object at 0x1d42fd0>

>>> Circle(Curve(), Curve(), Curve())
Circle.__init__(): curve1 <__main__.Curve object at 0x1e3db50>, curve2 <__main__.Curve object at 0x1d42fd0>, curve3 = <__main__.Curve object at 0x1d4b1d0>
<__main__.Circle object at 0x1d4b4d0>

>>> p1=Point((10,20))
>>> Circle(*(p1,)*3)
Circle.__init__(): point1 (10, 20), point2 (10, 20), point3 = (10, 20)
<__main__.Circle object at 0x1e3d890>

>>> Circle()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mhawke/virtualenvs/urllib3/lib/python2.7/site-packages/multipledispatch/dispatcher.py", line 235, in __call__
    func = self.resolve(types)
  File "/home/mhawke/virtualenvs/urllib3/lib/python2.7/site-packages/multipledispatch/dispatcher.py", line 184, in resolve
    (self.name, str_signature(types)))
NotImplementedError: Could not find signature for __init__: <>
mhawke
  • Looks useful. Do you know if it supports polymorphic arguments? For example, if `BezierCurve` was derived from `Curve`, would `Circle(bezier1, bezier2, bezier3)` still dispatch to `__init__(self, curve1, curve2, curve3)`? – martineau Aug 16 '14 at 15:26
  • Thanks. In this approach, where does the documentation go? If the user wants to know how to call each of the 3 (or more) functions for creating circles, where does he (or she) look? – bubba Aug 17 '14 at 03:10
  • @martineau -- I'm a Python neophyte, so maybe this is a stupid question. If a Python function is designed to receive objects of certain types as input, then doesn't it *automatically* work if objects of derived types are input?? – bubba Aug 17 '14 at 03:24
  • @bubba: Python functions aren't really "designed" to receive objects of any particular type. The only way that can come up is if somewhere in the function, code tries to do something illegal to one of the arguments. As long as they all expose the same methods, it doesn't matter. For example, if a function just adds 1 to every item in a collection, you could pass a list, tuple, dictionary, file object, any iterable really. If it walks like a duck and quacks like a duck... – whereswalden Aug 17 '14 at 03:45
  • @bubba: re documentation, I think that you'd need to document at the class level, rather than the individual versions of `__init__()`. Then `help(Circle)` would display the class docstring (as well as method docstrings). `Circle.__init__.__doc__` is set by `multipledispatch`. – mhawke Aug 17 '14 at 11:51
  • @bubba: I have updated the overloaded `__init__()` example to include a docstring for each variant, and to add a little code to manipulate the _class_ docstring to include each of the `__init__()` docstrings. I would have preferred to set the docstring for the `__init__()` method, but I couldn't get it to work. – mhawke Aug 17 '14 at 13:41
  • @martineau : yes, within reason that works, i.e. `class BezierCurve(Curve): pass`, dispatches to `__init__(self, curve1, curve2, curve3)`. I say within reason because multiple inheritance can create ambiguities, e.g. `class BezierCurve(Point, Curve): pass` will dispatch to `__init__(point1, point2, point3)` - a little contrived, but something to watch out for. – mhawke Aug 17 '14 at 13:59
  • [This answer](https://stackoverflow.com/a/67927532/3780389) to a possible duplicate question follows a similar approach with [`multimethod`](https://github.com/coady/multimethod). – teichert Jul 19 '21 at 21:23
2

You could use a dictionary, like so:

Circle({'points':[p1,p2,p3]})
Circle({'radius':r})
Circle({'curves':[c1,c2,c3]})

And the initializer would say:

def __init__(self, args):
  if len(args) > 1:
    raise ValueError("only pass one of points, radius, curves")
  if 'points' in args:
    pass  # blah
  elif 'radius' in args:
    pass  # blahblah
  elif 'curves' in args:
    pass  # evenmoreblah
  else:
    raise ValueError("only pass one of points, radius, curves")
Enrico Granata
  • Thanks. Is this a common approach, or would people (especially newbies) likely find it bizarre/exotic/confusing? – bubba Aug 16 '14 at 02:58
  • Rather than this, you probably should use `**kwargs` instead, if you want to force the use of keyword arguments. – Lie Ryan Aug 16 '14 at 03:19
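
The `**kwargs` suggestion in the last comment could be sketched roughly as follows. The keyword names, the added `center` keyword, and the `form` attribute are assumptions made for illustration; real construction code is omitted.

```python
class Circle:
    """Hypothetical sketch; real construction code is omitted."""

    # Each accepted combination of keyword names maps to one construction form.
    _FORMS = {
        frozenset(["points"]): "through points",
        frozenset(["center", "radius"]): "center and radius",
        frozenset(["curves"]): "tangent to curves",
    }

    def __init__(self, **kwargs):
        # Positional arguments are rejected by Python itself, so callers
        # must name what they pass.
        try:
            self.form = self._FORMS[frozenset(kwargs)]
        except KeyError:
            raise TypeError(
                "expected points=..., curves=..., or center=... with radius=...; "
                "got: %s" % (", ".join(sorted(kwargs)) or "nothing"))
        self.arguments = kwargs
```

A call then reads `Circle(center=p, radius=r)` or `Circle(points=[p1, p2, p3])`, with no dictionary literal at the call site.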
2

One way would be to just write code to parse the args yourself. Then you wouldn't have to change the API at all. You could even write a decorator so it'd be reusable:

import functools

def overload(func):
  '''Creates a signature from the arguments passed to the decorated function and passes it as the first argument'''
  @functools.wraps(func)
  def inner(*args):
    signature = tuple(map(type, args))
    return func(signature, *args)
  return inner

def matches(collection, sig):
  '''Returns True if each type in collection is a subclass of its respective item in sig'''
  if len(sig) != len(collection):
    return False
  return all(issubclass(i, j) for i, j in zip(collection, sig))

@overload
def Circle1(sig, *args):  
  if matches(sig, (Point,)*3):
    #do stuff with args
    print "3 points"
  elif matches(sig, (Point, float)):
    #as before
    print "point, float"
  elif matches(sig, (Curve,)*3):
    #and again
    print "3 curves"
  else:
    raise TypeError("Invalid argument signature")

# or even better
@overload
def Circle2(sig, *args):
  valid_sigs = {(Point,)*3: CircleThroughThreePoints,
                (Point, float): CircleCenterRadius,
                (Curve,)*3: CircleTangentThreeCurves
               }
  try:  
    return (f for s,f in valid_sigs.items() if matches(sig, s)).next()(*args)
  except StopIteration:
    raise TypeError("Invalid argument signature")

How it appears to API users:

This is the best part. To an API user, they just see this:

>>> help(Circle)

Circle(*args)
  Whatever's in Circle's docstring. You should put info here about valid signatures.

They can just call Circle like you showed in your question.

How it works:

The whole idea is to hide the signature-matching from the API. This is accomplished by using a decorator to create a signature, basically a tuple containing the types of each of the arguments, and passing that as the first argument to the functions.

overload:

When you decorate a function with @overload, overload is called with that function as an argument. Whatever is returned (in this case inner) replaces the decorated function. functools.wraps ensures that the new function has the same name, docstring, etc.

Overload is a fairly simple decorator. All it does is make a tuple of the types of each argument and pass that tuple as the first argument to the decorated function.

Circle take 1:

This is the simplest approach. At the beginning of the function, just test the signature against all valid ones.

Circle take 2:

This is a little more fancy. The benefit is that you can define all of your valid signatures together in one place. The return statement uses a generator to filter the matching valid signature from the dictionary, and .next() just gets the first one. Since that entire statement returns a function, you can just stick a () afterwards to call it. If none of the valid signatures match, .next() raises a StopIteration.

All in all, this function just returns the result of the function with the matching signature.
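
For readers on Python 3, the same lookup can be sketched with the built-in `next()` and a default value instead of the `.next()` method and `StopIteration`. This variant also checks the arguments directly with `isinstance()` rather than building a tuple of types first. The classes and the lambda handlers below are hypothetical placeholders, not part of the code above.

```python
class Point:
    """Hypothetical point type."""


class Curve:
    """Hypothetical curve base type."""


class BezierCurve(Curve):
    """Hypothetical derived curve type."""


def matches(args, signature):
    # True if there is one argument per slot and each is an instance of its slot.
    return len(args) == len(signature) and all(
        isinstance(arg, kind) for arg, kind in zip(args, signature))


def circle(*args):
    valid_signatures = [
        ((Point, Point, Point), lambda *a: "through points"),
        ((Point, float), lambda *a: "center and radius"),
        ((Curve, Curve, Curve), lambda *a: "tangent to curves"),
    ]
    # next() with a default returns None instead of raising StopIteration.
    handler = next((f for sig, f in valid_signatures if matches(args, sig)), None)
    if handler is None:
        raise TypeError("Invalid argument signature")
    return handler(*args)
```

Note that `circle(Point(), 2)` raises `TypeError` here, because an `int` is not a `float`; widening the radius slot is discussed in the comments.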

final notes:

One thing you see a lot in this bit of code is the *args construct. When used in a function definition, it just stores all the positional arguments in a tuple named "args". Elsewhere, it expands a sequence named args so that each item becomes an argument to a function (e.g. a = func(*args)).
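
A minimal illustration of both uses of `*args`; `collect` and `add` are made-up names for this sketch. The collected arguments arrive as a tuple.

```python
def collect(*args):
    # In a definition, *args gathers the positional arguments into a tuple.
    return args


def add(a, b, c):
    return a + b + c


values = [1, 2, 3]
packed = collect(*values)  # the call unpacks the list into three arguments
total = add(*packed)       # the tuple is unpacked again into a, b, c
```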

I don't think it's terribly uncommon to do odd things like this to present clean APIs in Python.

whereswalden
  • Thanks. That looks promising if it's not too unPythonic. Sorry, but I'm not yet smart enough to understand what you wrote. What would the function calls/usage and the function documentation look like? – bubba Aug 16 '14 at 03:47
  • Interesting approach. However the type-matching that's done is unPythonic (non-polymorphic) as it doesn't support arguments of types derived from those explicitly named. You might be able to fix that by using `isinstance()` in the signature matching instead of `type()`. – martineau Aug 16 '14 at 04:33
  • "doesn't support arguments of types derived from those explicitly named." That would be a big problem. I have many different curve types derived from Curve, and I'd need to be able to receive these as input, for use in the TangentThreeCurves variant. – bubba Aug 16 '14 at 05:12
  • @bubba: I've altered it a bit. Now it'll support derived types. – whereswalden Aug 16 '14 at 05:26
  • Looking pretty good. An(other) enhancement you might want to consider would be to allow sub-tuples of [unrelated] types for each argument to be specified. For example, instead of `(Point, float)` one could write `(Point, (float, int))` so the radius argument could be specified as either a real or integer value. Since the second argument to `issubclass()` can be a tuple, maybe that's already allowable... – martineau Aug 16 '14 at 18:46
  • Not a bad idea, but I'm having a hard time thinking of a use case since there's the `Number` [ABC](https://docs.python.org/2/library/numbers.html#numbers.Number) to cover `float`s and `int`s, and in most cases, any other types that could fit into the same slots also have an ABC they're both subclassed from. – whereswalden Aug 16 '14 at 21:02
  • Another note: the code in each of these functions can be thrown into the `__init__` method of a class without any real changes. – whereswalden Aug 16 '14 at 21:04
  • ABC `numbers.Number` might encompass too many subclasses (more than just `float`s and `int`s). An aside: on this website you need to include @username when you're addressing a comment to someone so they'll be notified of your post (which happens automatically to the author of a question, which is why I haven't had to in my own). – martineau Aug 17 '14 at 00:09
  • @whereswalden -- the `Number` ABC includes complex numbers, doesn't it? I don't think it makes sense to use a complex number as a circle radius. The `float`, `int`, `decimal` types all make sense, though. – bubba Aug 17 '14 at 03:16
  • @bubba: for that, there's `numbers.Real`. Check out the [documentation](https://docs.python.org/2/library/numbers.html) on `numbers`. – whereswalden Aug 17 '14 at 14:24
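
To make the `numbers.Number` versus `numbers.Real` discussion concrete, here is a small sketch of a radius check. The helper name `is_valid_radius` and the decision to reject `bool` are assumptions, not something the thread settles. One detail worth knowing: `Decimal` is registered as a `Number` but not as a `Real`, so it has to be listed explicitly.

```python
import numbers
from decimal import Decimal
from fractions import Fraction


def is_valid_radius(value):
    # Accept real numbers and Decimal, reject complex. bool is rejected
    # explicitly because it is a subclass of int (an assumed policy).
    if isinstance(value, bool):
        return False
    return isinstance(value, (numbers.Real, Decimal))
```

With this check, `10`, `10.22`, `Fraction(1, 2)` and `Decimal(100)` are accepted, while `complex(10.22)` and `True` are not.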