5

I need to deal with two objects of a class in a way that returns a third object of the same class, and I am trying to determine whether it is better to do this as an independent function that receives two objects and returns the third, or as a method that takes one other object and returns the third.


For a simple example, would this:

from collections import namedtuple

class Point(namedtuple('Point', 'x y')):
    __slots__ = ()
    #Attached to class
    def midpoint(self, otherpoint):
        mx = (self.x + otherpoint.x) / 2.0
        my = (self.y + otherpoint.y) / 2.0
        return Point(mx, my)

a = Point(1.0, 2.0)
b = Point(2.0, 3.0)

print(a.midpoint(b))
#Point(x=1.5, y=2.5)

Or this:

from collections import namedtuple

class Point(namedtuple('Point', 'x y')):
    __slots__ = ()


#not attached to class
#takes two point objects
def midpoint(p1, p2):
    mx = (p1.x + p2.x) / 2.0
    my = (p1.y + p2.y) / 2.0
    return Point(mx, my)


a = Point(1.0, 2.0)
b = Point(2.0, 3.0)

print(midpoint(a, b))
#Point(x=1.5, y=2.5)

and why would one be preferred over the other?


This seems far less clear cut than I had expected when I asked the question.

In summary, it seems that something like a.midpoint(b) is not preferred, since it gives a special place to one point or the other in what is really a symmetric function that returns a completely new point instance. Beyond that, it seems to be largely a matter of taste and style whether to use a freestanding module function or a function attached to the class but not meant to be called on an instance, such as Point.midpoint(a, b).

I think, personally, I stylistically lean towards free-standing module functions, but it may depend on the circumstances. In cases where the function is definitely tightly bound to the class and there is any risk of namespace pollution or potential confusion, then making a class function probably makes more sense.

Also, a couple of people mentioned making the function more general, perhaps by implementing additional features of the class to support this. In this particular case dealing with points and midpoints, that is probably the overall best approach. It supports polymorphism and code reuse and is highly readable. In a lot of cases though, that would not work (the project that inspired me to ask this for instance), but points and midpoints seemed like a concise and understandable example to illustrate the question.

Thank you all, it was enlightening.

TimothyAWiseman

6 Answers

3

The first approach is reasonable and isn't conceptually different from what set.union and set.intersection do. Any func(Point, Point) --> Point is clearly related to the Point class, so there is no question about interfering with the unity or cohesion of the class.
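As a small illustration of the comparison with set.union, the sketch below shows the same operation called as a method on one operand and through the class with both operands passed explicitly. The variable names are only for illustration.

```python
a = {1, 2}
b = {2, 3}

# Method form: one operand is the receiver, the other is the argument.
merged = a.union(b)

# The same method looked up on the class, with both operands passed explicitly.
merged_via_class = set.union(a, b)

# Both spellings build a new set and leave the inputs untouched.
print(merged == merged_via_class)
```

The analogous spellings for the question's class would be a.midpoint(b) and Point.midpoint(a, b), both backed by the same regular method.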

It would be a tougher choice if different classes were involved: draw_perpendicular(line, point) --> line. To resolve the choice of classes, you would pick the one that has the most related logic. For example, str.join needs a string delimiter and a list of strings. It could have been a standalone function (as it was in the old days with the string module), or it could be a method on lists (but it only works for lists of strings), or a method on strings. The latter was chosen because joining is more about strings than it is about lists. This choice was made even though it led to the arguably awkward expression delimiter.join(things_to_join).

I disagree with the other respondent who recommended using a classmethod. Those are often used for alternate constructor signatures but not for transformations on instances of the class. For example, datetime.fromordinal is a classmethod for constructing a date from something other than an instance of the class (in this case, from an int). This contrasts with datetime.replace, which is a regular method for making a new datetime instance based on an existing instance. This should steer you away from using classmethod for the midpoint computation.
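The contrast between the two datetime methods can be seen in a few lines. This is only a sketch; the specific ordinal and the replacement date are arbitrary choices.

```python
from datetime import datetime

# Classmethod: called on the class, builds an instance from an int.
first_day = datetime.fromordinal(1)

# Regular method: called on an existing instance, returns a new instance
# derived from it. The original instance is not modified.
later = first_day.replace(year=2011, month=10, day=26)

print(first_day)
print(later)
```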

One other thought: if you keep midpoint() with the Point() class, it makes it possible to create other classes that have the same Point API but a different internal representation (i.e. polar coordinates may be more convenient for some types of work than Cartesian coordinates). If midpoint() is a separate function you start to lose the benefits of encapsulation and of a coherent interface.
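A rough sketch of that idea follows, written for Python 3. PolarPoint is a hypothetical class invented for this illustration. It stores (r, theta) internally, exposes x and y as computed properties, and offers the same midpoint() method, so callers do not need to know which representation they hold.

```python
import math
from collections import namedtuple


class Point(namedtuple('Point', 'x y')):
    __slots__ = ()

    def midpoint(self, other):
        # Only relies on the other object exposing .x and .y
        return Point((self.x + other.x) / 2.0, (self.y + other.y) / 2.0)


class PolarPoint(namedtuple('PolarPoint', 'r theta')):
    """Hypothetical class: same public API as Point, polar storage."""
    __slots__ = ()

    @property
    def x(self):
        return self.r * math.cos(self.theta)

    @property
    def y(self):
        return self.r * math.sin(self.theta)

    def midpoint(self, other):
        # Average in Cartesian space, then convert back to polar form.
        mx = (self.x + other.x) / 2.0
        my = (self.y + other.y) / 2.0
        return PolarPoint(math.hypot(mx, my), math.atan2(my, mx))


p = PolarPoint(2.0, 0.0)   # lies at Cartesian (2, 0)
q = Point(0.0, 2.0)
m = p.midpoint(q)          # a PolarPoint close to Cartesian (1, 1)
```

Note that each class decides for itself what type its midpoint() returns; a freestanding function would have to make that choice on behalf of every representation.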

Raymond Hettinger
  • IMO, ``midpoint`` can be considered an alternate constructor (for creating a ``Point`` at the midpoint of two other points). – codeape Oct 26 '11 at 00:01
  • See also this discussion http://stackoverflow.com/questions/38238/what-are-class-methods-in-python-for about the purposes of classmethods. This classmethod usage seems to fit the bill in the top voted answer there pretty well...and it is the only way to get the appropriate inheritance. – Josh Bleecher Snyder Oct 26 '11 at 00:08
  • I'm dismayed at the profound misunderstanding of classmethods shown in this thread. Would you make int.__add__ into a classmethod because it "constructs" a new integer from two existing integers? How about set.difference or dict.copy or namedtuple._replace or str.replace? – Raymond Hettinger Oct 26 '11 at 00:13
  • 1
    No, I wouldn't make int.__add__ into a classmethod; I believe we all agree that ``2 + 2`` is the most natural way to add numbers. Re. namedtuple._replace and str.replace: I don't think the comparison is 100% valid. namedtuple._replace/str.replace return a new instance that is a modified version of the original instance. midpoint() returns a value that is calculated as "a combination" of two instances. – codeape Oct 26 '11 at 00:35
2

I would choose the second option because, in my opinion, it is clearer than the first. You are performing the midpoint operation between two points; not the midpoint operation with respect to a point. Similarly, a natural extension of this interface could be to define dot, cross, magnitude, average, median, etc. Some of those functions will operate on pairs of Points and others may operate on lists. Making it a function makes them all have consistent interfaces.

Defining it as a function also allows it to be used with any pair of objects that present a .x .y interface, while making it a method requires that at least one of the two is a Point.
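A minimal sketch of that duck typing, assuming Python 3: the freestanding function only reads .x and .y, so it accepts objects that are not Point instances at all. The pixel object below is a made-up stand-in for such a type.

```python
from collections import namedtuple
from types import SimpleNamespace

Point = namedtuple('Point', 'x y')


def midpoint(p1, p2):
    # Works for any two objects that expose numeric .x and .y attributes.
    return Point((p1.x + p2.x) / 2.0, (p1.y + p2.y) / 2.0)


# Hypothetical non-Point object that happens to have the same attributes.
pixel = SimpleNamespace(x=4, y=0)

mixed = midpoint(Point(0.0, 2.0), pixel)
neither = midpoint(pixel, pixel)
```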

Lastly, to address the location of the function, I believe it makes sense to co-locate it in the same package as the Point class. This places it in the same namespace, which clearly indicates its relationship with Point and, in my opinion, is more pythonic than a static or class method.
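As a rough sketch of what such a co-located set of functions might look like (Python 3 assumed; only Point and midpoint come from the question, while the other function bodies are illustrative guesses at the operations named above):

```python
import math
from collections import namedtuple

Point = namedtuple('Point', 'x y')


def midpoint(p1, p2):
    """Point halfway between two points."""
    return Point((p1.x + p2.x) / 2.0, (p1.y + p2.y) / 2.0)


def dot(p1, p2):
    """Dot product of two points treated as vectors."""
    return p1.x * p2.x + p1.y * p2.y


def magnitude(p):
    """Distance of a single point from the origin."""
    return math.hypot(p.x, p.y)


def average(points):
    """Component-wise mean of a non-empty iterable of points."""
    points = list(points)
    n = float(len(points))
    return Point(sum(p.x for p in points) / n,
                 sum(p.y for p in points) / n)
```

Functions taking one point, two points, or a whole collection all share the same calling style, which is the consistency argued for above.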

Update: Further reading on the Pythonicness of @staticmethod vs package/module:

In both Thomas Wouters' answer to the question "What is the difference between staticmethod and classmethod in Python" and Mike Steder's answer to "__init__ and arguments in Python", the authors indicated that a package or module of related functions is perhaps a better solution. Thomas Wouters has this to say:

[staticmethod] is basically useless in Python -- you can just use a module function instead of a staticmethod.

While Mike Steder comments:

If you find yourself creating objects that consist of nothing but staticmethods the more pythonic thing to do would be to create a new module of related functions.

However, codeape rightly points out below that a calling convention of Point.midpoint(a,b) will co-locate the functionality with the type. The BDFL also seems to value @staticmethod as the __new__ method is a staticmethod.

My personal preference would be to use a function for the reasons cited above, but it appears that the choice between @staticmethod and a stand-alone function is largely in the eye of the beholder.
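For comparison, here is a minimal sketch of the @staticmethod spelling, assuming Python 3. The body is the same as the freestanding function from the question; only the place where the name lives changes.

```python
from collections import namedtuple


class Point(namedtuple('Point', 'x y')):
    __slots__ = ()

    @staticmethod
    def midpoint(p1, p2):
        # No self or cls: this is a plain function stored on the class.
        return Point((p1.x + p2.x) / 2.0, (p1.y + p2.y) / 2.0)


a = Point(1.0, 2.0)
b = Point(2.0, 3.0)

# Called through the class, so neither point is given a special role.
m = Point.midpoint(a, b)
```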

Nathan
  • 1
    Re. your last point: I believe that locating the method as a ``@classmethod`` inside the Point class makes that relationship even clearer. – codeape Oct 26 '11 at 00:03
  • 1
    +1 for "You are performing the midpoint operation between two points; not the midpoint operation with respect to a point." The syntax ``p1.midpoint(p2)`` IMO indicates that ``p1`` and ``p2`` play different roles. (Point.)``midpoint(p1, p2)`` looks better and easier to understand to me. – codeape Oct 26 '11 at 00:09
  • @codeape: I hope that I incorporated your point into my answer. I think it comes down to personal preference which is better: a `@staticmethod` or a function. – Nathan Oct 26 '11 at 01:18
2

In this case you can use operator overloading:

from collections import namedtuple

class Point(namedtuple('Point', 'x y')):
    __slots__ = ()

    # Component-wise addition of two points
    def __add__(self, otherpoint):
        mx = self.x + otherpoint.x
        my = self.y + otherpoint.y
        return Point(mx, my)

    # Division of a point by a scalar
    def __truediv__(self, scalar):
        return Point(self.x / scalar, self.y / scalar)

    # Python 2 looks up __div__ for the / operator; Python 3 uses __truediv__
    __div__ = __truediv__


a = Point(1.0, 2.0)
b = Point(2.0, 3.0)

def mid(a, b):  # general function
    return (a + b) / 2

print(mid(a, b))
# Point(x=1.5, y=2.5)

I think the decision mostly depends on how general and abstract the function is. If you can write the function in a way that works on all objects that implement a small set of clean interfaces, then you can turn it into a separate function. The more interfaces your function depends on and the more specific they are, the more it makes sense to put it on the class (as instances of this class will most likely be the only objects the function will work with anyway).
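To illustrate how small that interface is, the sketch below reuses the same one-line mid function with types that have nothing to do with points. It assumes Python 3, where / is true division; the sample values are arbitrary.

```python
from fractions import Fraction


def mid(a, b):
    # Needs only "+" between the operands and "/" by a number.
    return (a + b) / 2


mid_float = mid(1.0, 3.0)
mid_fraction = mid(Fraction(1, 2), Fraction(3, 2))
mid_complex = mid(1 + 1j, 3 + 3j)
```

Any class that defines addition and division by a scalar, such as the Point above, can be passed to the same function unchanged.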

Jochen Ritzel
  • An excellent point for this particular situation, but it doesn't really fit for the real project I am working on. It just seemed better to use something short and simple for the example in the question. – TimothyAWiseman Oct 26 '11 at 08:01
1

I would choose version one, because this way all functionality for points is stored in the point class, i.e. grouping related functionality. Additionally, point objects know best about the meaning and inner workings of their data, so it's the right place to implement your function. An external function, for example in C++, would have to be a friend, which smells like a hack.

hochl
  • +1 on the comment about point objects knowing the best meaning of and inner workings of their data. That corresponds to the Information Expert in GRASP: http://en.wikipedia.org/wiki/GRASP_(object-oriented_design) . -1 on the suggestion to use a staticmethod -- those are used to attach regular functions to classes with related functionality but that don't need to know anything about instances of the class. – Raymond Hettinger Oct 25 '11 at 23:48
  • 1
    After some research I must disagree: @staticmethod is indeed a good place for this functionality. Reading Nathan's answer above even seems to confirm this. Your -1 is thus not justified. – hochl Oct 26 '11 at 09:58
  • The recommendation to use @staticmethod is bad advice. Grep for staticmethod in Lib/*.py in the standard library. You'll see that no public method has been made static. This should be a pretty strong clue that the core developers consider the use of staticmethods to be a rare use case. – Raymond Hettinger Oct 26 '11 at 16:14
  • I removed it again so that my response contains no unworthy information :) – hochl Oct 26 '11 at 16:39
1

Another option is to use a @classmethod. It is probably what I would prefer in this case.

class Point(...):
    @classmethod
    def midpoint(cls, p1, p2):
        mx = (p1.x + p2.x) / 2.0
        my = (p1.y + p2.y) / 2.0
        return cls(mx, my)

# ...
print(Point.midpoint(a, b))
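The inheritance behaviour mentioned in the comments on another answer can be sketched as follows, assuming Python 3. LabeledPoint is a hypothetical subclass added only for this illustration; because the classmethod builds its result with cls, calling it through the subclass yields a subclass instance.

```python
from collections import namedtuple


class Point(namedtuple('Point', 'x y')):
    __slots__ = ()

    @classmethod
    def midpoint(cls, p1, p2):
        mx = (p1.x + p2.x) / 2.0
        my = (p1.y + p2.y) / 2.0
        return cls(mx, my)


class LabeledPoint(Point):
    """Hypothetical subclass used to show what cls resolves to."""
    __slots__ = ()

    def label(self):
        return "(%s, %s)" % (self.x, self.y)


a = Point(1.0, 2.0)
b = Point(2.0, 3.0)

plain = Point.midpoint(a, b)           # built with cls = Point
labeled = LabeledPoint.midpoint(a, b)  # built with cls = LabeledPoint
```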
codeape
0

A different way of doing this is to access x and y through the namedtuple's subscript interface. You can then completely generalize the midpoint function to n dimensions.

class Point(namedtuple('Point', 'x y')):
    __slots__ = ()

def midpoint(left, right):
    return tuple([sum(a)/2. for a in zip(left, right)])

This design works for Point classes, n-tuples, lists of length n, etc. For example:

>>> midpoint(Point(0,0), Point(1,1))
(0.5, 0.5)
>>> midpoint(Point(5,1), (3, 2))
(4.0, 1.5)
>>> midpoint((1,2,3), (4,5,6))
(2.5, 3.5, 4.5)
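One consequence of this design is that the result is always a plain tuple, even when both inputs are Points. A possible variation, sketched here under the assumption that preserving the namedtuple type is desirable, rebuilds the result through the left operand's _make constructor when it has one and falls back to tuple otherwise.

```python
from collections import namedtuple


class Point(namedtuple('Point', 'x y')):
    __slots__ = ()


def midpoint(left, right):
    coords = [(a + b) / 2.0 for a, b in zip(left, right)]
    # namedtuple classes provide a _make classmethod that accepts an iterable;
    # for plain tuples, lists, etc. fall back to building a tuple.
    make = getattr(type(left), '_make', tuple)
    return make(coords)


from_points = midpoint(Point(0, 0), Point(1, 1))
from_tuples = midpoint((1, 2, 3), (4, 5, 6))
```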
Nathan