132

The Zen of Python states that there should only be one way to do things, yet I frequently run into the problem of deciding when to use a function versus when to use a method.

Let's take a trivial example: a ChessBoard object. Let's say we need some way to get all the legal King moves available on the board. Do we write ChessBoard.get_king_moves() or get_king_moves(chess_board)?
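To make the two options concrete, here is a minimal sketch. The board representation is an assumption made purely for illustration: it tracks only the king's square and ignores checks, castling, and other pieces, so "legal" here just means "stays on the board".

```python
class ChessBoard:
    """Hypothetical, heavily simplified board used only for illustration."""

    SIZE = 8

    def __init__(self, king_square):
        # (file, rank), both in the range 0-7
        self.king_square = king_square

    # Option 1: a method on the board.
    def get_king_moves(self):
        file, rank = self.king_square
        moves = []
        for df in (-1, 0, 1):
            for dr in (-1, 0, 1):
                if (df, dr) == (0, 0):
                    continue  # staying put is not a move
                f, r = file + df, rank + dr
                if 0 <= f < self.SIZE and 0 <= r < self.SIZE:
                    moves.append((f, r))
        return moves


# Option 2: a free function that takes the board as an argument.
# Here it simply delegates, which shows that the two spellings can coexist.
def get_king_moves(chess_board):
    return chess_board.get_king_moves()


board = ChessBoard(king_square=(0, 0))
print(board.get_king_moves())
print(get_king_moves(board))
```

Both calls return the same list; the question is only which spelling should be the public one.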

Here are some related questions I looked at:

The answers I got were largely inconclusive:

Why does Python use methods for some functionality (e.g. list.index()) but functions for other (e.g. len(list))?

The major reason is history. Functions were used for those operations that were generic for a group of types and which were intended to work even for objects that didn’t have methods at all (e.g. tuples). It is also convenient to have a function that can readily be applied to an amorphous collection of objects when you use the functional features of Python (map(), apply() et al).

In fact, implementing len(), max(), min() as a built-in function is actually less code than implementing them as methods for each type. One can quibble about individual cases but it’s a part of Python, and it’s too late to make such fundamental changes now. The functions have to remain to avoid massive code breakage.

While interesting, the above doesn't really say much as to what strategy to adopt.

This is one of the reasons - with custom methods, developers would be free to choose a different method name, like getLength(), length(), getlength() or whatsoever. Python enforces strict naming so that the common function len() can be used.

Slightly more interesting. My take is that functions are in a sense, the Pythonic version of interfaces.
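A minimal sketch of that "interface" idea: the built-in len() delegates to the __len__ special method, so any class that defines it can be used with the same call. The Playlist class is hypothetical and exists only for this example.

```python
class Playlist:
    """Hypothetical container used only for illustration."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # len(playlist) ends up calling this special method.
        return len(self._songs)


playlist = Playlist(["a", "b", "c"])

# The same built-in works for unrelated types that implement __len__.
for obj in ([1, 2], "abc", {"k": 1}, playlist):
    print(type(obj).__name__, len(obj))
```

No shared base class is needed; implementing __len__ is what makes an object usable with len().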

Lastly, from Guido himself:

Talking about the Abilities/Interfaces made me think about some of our "rogue" special method names. In the Language Reference, it says, "A class can implement certain operations that are invoked by special syntax (such as arithmetic operations or subscripting and slicing) by defining methods with special names." But there are all these methods with special names like __len__ or __unicode__ which seem to be provided for the benefit of built-in functions, rather than for support of syntax. Presumably in an interface-based Python, these methods would turn into regularly-named methods on an ABC, so that __len__ would become

class container:
  ...
  def len(self):
    raise NotImplemented

Though, thinking about it some more, I don't see why all syntactic operations wouldn't just invoke the appropriate normally-named method on a specific ABC. "<", for instance, would presumably invoke "object.lessthan" (or perhaps "comparable.lessthan"). So another benefit would be the ability to wean Python away from this mangled-name oddness, which seems to me an HCI improvement.

Hm. I'm not sure I agree (figure that :-).

There are two bits of "Python rationale" that I'd like to explain first.

First of all, I chose len(x) over x.len() for HCI reasons (def __len__() came much later). There are two intertwined reasons actually, both HCI:

(a) For some operations, prefix notation just reads better than postfix -- prefix (and infix!) operations have a long tradition in mathematics which likes notations where the visuals help the mathematician thinking about a problem. Compare the ease with which we rewrite a formula like x*(a+b) into x*a + x*b to the clumsiness of doing the same thing using a raw OO notation.

(b) When I read code that says len(x) I know that it is asking for the length of something. This tells me two things: the result is an integer, and the argument is some kind of container. To the contrary, when I read x.len(), I have to already know that x is some kind of container implementing an interface or inheriting from a class that has a standard len(). Witness the confusion we occasionally have when a class that is not implementing a mapping has a get() or keys() method, or something that isn't a file has a write() method.

Saying the same thing in another way, I see 'len' as a built-in operation. I'd hate to lose that. I can't say for sure whether you meant that or not, but 'def len(self): ...' certainly sounds like you want to demote it to an ordinary method. I'm strongly -1 on that.

The second bit of Python rationale I promised to explain is the reason why I chose special methods to look __special__ and not merely special. I was anticipating lots of operations that classes might want to override, some standard (e.g. __add__ or __getitem__), some not so standard (e.g. pickle's __reduce__ for a long time had no support in C code at all). I didn't want these special operations to use ordinary method names, because then pre-existing classes, or classes written by users without an encyclopedic memory for all the special methods, would be liable to accidentally define operations they didn't mean to implement, with possibly disastrous consequences. Ivan Krstić explained this more concisely in his message, which arrived after I'd written all this up.

-- --Guido van Rossum (home page: http://www.python.org/~guido/)

My understanding of this is that in certain cases one notation simply reads better than the other (e.g., Duck.quack makes more sense than quack(Duck) from a linguistic standpoint), and again, the functions allow for "interfaces".

In such a case, my guess would be to implement get_king_moves based solely on Guido's first point. But that still leaves a lot of open questions regarding, say, implementing a stack and a queue class with similar push and pop methods: should they be functions or methods? (Here I would guess functions, because I really want to signal a push-pop interface.)

TLDR: Can someone explain what the strategy for deciding when to use functions vs. methods should be?

Ceasar
  • 2
    Meh, I always thought of that as utterly arbitrary. Duck typing allows for implicit "interfaces", it doesn't make much difference whether you have `X.frob` or `X.__frob__` and free-standing `frob`. – Cat Plus Plus Nov 13 '11 at 00:51
  • 2
    While I mostly agree with you, in principle your answer isn't Pythonic. Recall, "In the face of ambiguity, refuse the temptation to guess." (Of course, deadlines would change this, but I'm doing this for fun / self-improvement.) – Ceasar Nov 13 '11 at 01:04
  • This is one thing I do not like about Python. I feel that if you are going to force type casting, like an int to a string, then just make it a method. It's annoying and time-consuming to have to enclose it in parens. – Matt Nov 13 '11 at 06:44
  • 1
    This is the most important reason I don't like Python: you never know whether you have to look for a function or a method when you want to achieve something. And it even gets more convoluted when you use additional libraries with new data types like vectors or data frames. – vonjd Nov 09 '17 at 16:42
  • 1
    *"The Zen of Python states that there should only be one way to do things"* except it doesn't. – gented Jul 17 '18 at 08:26
  • @gented It says `There should be one-- and preferably only one --obvious way to do it`. It seems the OP's paraphrase is perfectly reasonable here; we all know what they mean. – eric Feb 09 '20 at 03:49
  • Came here to say I'm glad this question wasn't closed as opinion based. Great research went into it, and useful answers. – eric Feb 09 '20 at 03:50

6 Answers

100

My general rule is this - is the operation performed on the object or by the object?

If it is done by the object, it should be a member operation. If it could apply to other things too, or is done by something else to the object, then it should be a function (or perhaps a member of something else).

When introducing programming, it is traditional (albeit incorrect from an implementation standpoint) to describe objects in terms of real-world objects such as cars. You mention a duck, so let's go with that.

class duck:
    def __init__(self): pass
    def eat(self, o): pass
    def crap(self): pass
    def die(self): pass
    ...

In the context of the "objects are real things" analogy, it is "correct" to add a method to the class for anything which the object can do. So say I want to kill off a duck, do I add a .kill() to the duck? No... as far as I know animals do not commit suicide. Therefore if I want to kill a duck I should do this:

def kill(o):
    # dog and nyancat are assumed to be classes defined elsewhere
    if isinstance(o, duck):
        o.die()
    elif isinstance(o, dog):
        print("WHY????")
        o.die()
    elif isinstance(o, nyancat):
        raise Exception("NYAN " * 9001)
    else:
        print("can't kill it.")

Moving away from this analogy, why do we use methods and classes? Because we want to contain data and hopefully structure our code in a manner such that it will be reusable and extensible in the future. This brings us to the notion of encapsulation which is so dear to OO design.

The encapsulation principle is really what this comes down to: as a designer you should hide everything about the implementation and class internals that is not absolutely necessary for a user or other developer to access. Because we deal with instances of classes, this reduces to "what operations are crucial on this instance". If an operation is not instance specific, then it should not be a member function.

TL;DR: what @Bryan said. If it operates on an instance and needs to access data which is internal to the class instance, it should be a member function.
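As a minimal sketch of that rule, assuming a hypothetical Account class and format_amount helper (neither comes from the question): the operation that needs internal state is a method, and the one that is not instance specific is a plain function.

```python
class Account:
    """Hypothetical class used only for illustration."""

    def __init__(self, balance=0):
        self._balance = balance  # internal state, private by convention

    def withdraw(self, amount):
        # Needs and mutates internal state, so it is a method.
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount

    @property
    def balance(self):
        # Read-only view of the internal state.
        return self._balance


def format_amount(amount):
    # Not specific to any instance, so it is a plain function.
    return "{:.2f}".format(amount)


account = Account(100)
account.withdraw(30)
print(format_amount(account.balance))  # 70.00
```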

charlie80
arrdem
  • So in short, non-member functions operate on immutable objects, mutable objects use member-functions? (Or is this too strict a generalization? This for certain works only because immutable types have no state.) – Ceasar Nov 13 '11 at 01:37
  • 1
    From a strict OOP standpoint I guess that is fair. As Python has both public and private variables (variables with names beginning in __) and, unlike Java, provides zero guaranteed access protection, there are no absolutes, simply because we are debating a permissive language. In a less permissive language like Java, however, remember that getFoo() and setFoo() functions are the norm, so immutability is not absolute. Client code is just not allowed to make assignments to members. – arrdem Nov 13 '11 at 02:26
  • 1
    @Ceasar That's not true. Immutable objects have state; otherwise there would be nothing to tell apart any integer from any other. Immutable objects don't **change** their state. In general, this makes it much less problematic for all their state to be public. And in that setting, it's much easier to meaningfully manipulate an immutable all-public object with functions; there's no "privilege" of being a method. – Ben Nov 13 '11 at 02:28
  • Sorry, I think you've missed the point. It's about modelling and semantics and what sounds more natural and/or is more useful. – Karl Knechtel Nov 13 '11 at 02:30
  • 1
    @CeasarBautista yeah, Ben has a point. There are three "major" schools of code design: 1) not designed, 2) OOP and 3) functional. In a functional style, there are no states at all. This is the way that I see most Python code designed: it takes input and generates output with few side effects. The **point** of OOP is that everything has a state. Classes are a container for states, and "member" functions therefore are state-dependent, side-effect-based code which loads the state defined in the class whenever invoked. Python tends to lean functional, hence the preference for non-member code. – arrdem Nov 13 '11 at 02:40
  • 1
    eat(self), crap(self), die(self). hahahaha – Wapiti Oct 05 '15 at 18:10
  • The _is_ keyword does not test type but identity. – skywalker Jul 12 '17 at 05:55
  • myString.len() is *on* an object, yet we have to do this convoluted len(myString). I don't see how this answer is accurate. – WestCoastProjects Aug 27 '18 at 04:17
27

Use a class when you want to:

1) Isolate calling code from implementation details -- taking advantage of abstraction and encapsulation.

2) Be substitutable for other objects -- taking advantage of polymorphism.

3) Reuse code for similar objects -- taking advantage of inheritance.

Use a function for calls that make sense across many different object types -- for example, the builtin len and repr functions apply to many kinds of objects.

That being said, the choice sometimes comes down to a matter of taste. Think in terms of what is most convenient and readable for typical calls. For example, which would be better (x.sin()**2 + y.cos()**2).sqrt() or sqrt(sin(x)**2 + cos(y)**2)?
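To compare the two spellings side by side, here is a rough sketch. The function style uses the standard math module; the Num wrapper is hypothetical and exists only so that the method style can run at all.

```python
import math

x, y = 0.5, 1.25

# Function (prefix) notation, as provided by the standard math module.
function_result = math.sqrt(math.sin(x) ** 2 + math.cos(y) ** 2)


class Num:
    """Hypothetical float wrapper, only here to demonstrate method style."""

    def __init__(self, value):
        self.value = value

    def sin(self):
        return Num(math.sin(self.value))

    def cos(self):
        return Num(math.cos(self.value))

    def sqrt(self):
        return Num(math.sqrt(self.value))

    def __pow__(self, exponent):
        return Num(self.value ** exponent)

    def __add__(self, other):
        return Num(self.value + other.value)


# Method (postfix) notation for the same computation.
method_result = (Num(x).sin() ** 2 + Num(y).cos() ** 2).sqrt()

print(function_result, method_result.value)
```

Both lines compute the same number; the difference is purely in how the expression reads.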

Raymond Hettinger
9

I usually think of an object like a person.

Attributes are the person's name, height, shoe size, etc.

Methods and functions are operations that the person can perform.

If the operation could be done by just any ol' person, without requiring anything unique to this one specific person (and without changing anything on this one specific person), then it's a function and should be written as such.

If an operation is acting upon the person (e.g. eating, walking, ...) or requires something unique to this person to get involved (like dancing, writing a book, ...), then it should be a method.

Of course, it is not always trivial to translate this into the specific object you're working with, but I find it is a good way to think of it.
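A small sketch of the analogy, with every name (Person, eat, write_autobiography, count_to) made up for illustration:

```python
class Person:
    """Hypothetical class used only for illustration."""

    def __init__(self, name, height_cm):
        # Attributes: facts about this particular person.
        self.name = name
        self.height_cm = height_cm
        self.meals_eaten = 0

    def eat(self):
        # Acts upon this specific person and changes their state: a method.
        self.meals_eaten += 1

    def write_autobiography(self):
        # Requires something unique to this person: a method.
        return "My name is {} and I am {} cm tall.".format(
            self.name, self.height_cm
        )


def count_to(n):
    # Any ol' person could do this, and nobody is changed by it: a function.
    return list(range(1, n + 1))


alice = Person("Alice", 170)
alice.eat()
print(alice.meals_eaten)            # 1
print(alice.write_autobiography())  # My name is Alice and I am 170 cm tall.
print(count_to(3))                  # [1, 2, 3]
```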

Thriveth
  • 1
    but you can measure any old person's height, so by that logic it should be `height(person)`, not `person.height`? – endolith Sep 10 '14 at 14:54
  • @endolith Sure, but I'd say height is better off as an attribute because you don't need to perform any fancy work to retrieve it. Writing a function to retrieve a number seems like unnecessary hoops to jump through. – Thriveth Sep 11 '14 at 07:43
  • @endolith On the other hand, if the person is a kid who grows and changes height, it would be obvious to let a method take care of that, not a function. – Thriveth Sep 11 '14 at 07:44
  • This doesn't hold true. What if you have `are_married(person1, person2)`? This query is very general and should thus be a function and not a method. – Pithikos Sep 25 '14 at 11:14
  • @Pithikos Sure, but that also involves more than just an attribute or action performed on one Object (Person), but rather a relation between two objects. – Thriveth Sep 26 '14 at 13:59
7

Here's a simple rule of thumb: if the code acts upon a single instance of an object, use a method. Even better: use a method unless there is a compelling reason to write it as a function.

In your specific example, you want it to look like this:

chessboard = Chessboard()
...
chessboard.get_king_moves()

Don't overthink it. Always use methods until the point comes where you say to yourself "it doesn't make sense to make this a method", in which case you can make a function.

Bryan Oakley
  • 2
    Can you explain why you default to methods over functions? (And could you explain if that rule still makes sense in the case of the stack and queue / pop and push methods?) – Ceasar Nov 13 '11 at 01:14
  • 11
    That rule-of-thumb doesn't make sense. The standard library itself can be a guide to when to use classes versus functions. *heapq* and *math* are functions because they operate on normal python objects (floats and lists) and because they don't need to maintain complex state like the *random* module does. – Raymond Hettinger Nov 13 '11 at 17:03
  • 1
    Are you saying the rule makes no sense because the standard library violates it? I don't think that conclusion makes sense. For one, rules of thumb are just that -- they aren't absolute rules. Plus, a large part of the standard library _does_ follow that rule of thumb. The OP was looking for some simple guidelines, and I think my advice is perfectly good for someone who is just starting out. I appreciate your point of view, however. – Bryan Oakley Nov 13 '11 at 17:12
  • Well the STL has some good reasons to do so. For math and co it's obviously useless to have a class since we don't have any state for it (in other languages those are classes with only static functions). For things that operate on several different containers like say `len()` it can also be argued that the design makes sense, although I'd personally think functions wouldn't be too bad there either - we'd just have another convention that `len()` always has to return an integer (but with all the additional problems of backcomp I wouldn't advocate that for python either) – Voo Nov 13 '11 at 19:18
  • -1 as this answer is completely arbitrary: you are essentially trying to demonstrate that X is better than Y by *assuming* that X is better than Y. – gented Jul 17 '18 at 08:28
6

Generally I use classes to implement a logical set of capabilities for some thing, so that in the rest of my program I can reason about the thing, not having to worry about all the little concerns that make up its implementation.

Anything that's part of that core abstraction of "what you can do with a thing" should usually be a method. This generally includes everything that can alter a thing, as the internal data state is usually considered private and not part of the logical idea of "what you can do with a thing".

When you come to higher level operations, especially if they involve multiple things, I find they are usually most naturally expressed as functions, if they can be built out of the public abstraction of a thing without needing special access to the internals (unless they're methods of some other object). This has the big advantage that when I decide to completely rewrite the internals of how my thing works (without changing the interface), I just have a small core set of methods to rewrite, and then all the external functions written in terms of those methods will Just Work. I find that insisting that all operations to do with class X are methods on class X leads to over-complicated classes.
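A minimal sketch of that split, using a hypothetical Stack class: a small core of methods touches the internals, and the higher-level operations are functions written only against those public methods.

```python
class Stack:
    """Hypothetical stack; the backing list is an internal detail."""

    def __init__(self):
        self._items = []

    # Core abstraction: the only code that touches self._items.
    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items


def drain(stack):
    # Higher-level operation built purely on the public methods.
    result = []
    while not stack.is_empty():
        result.append(stack.pop())
    return result


def transfer(source, target):
    # Involves two objects, so a free function is a natural fit.
    while not source.is_empty():
        target.push(source.pop())


s = Stack()
for n in (1, 2, 3):
    s.push(n)
print(drain(s))  # [3, 2, 1]
```

If the internals of Stack were later rewritten (say, as a linked list), only push, pop, and is_empty would need to change; drain and transfer would keep working unmodified.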

It depends on the code I'm writing though. For some programs I model them as a collection of objects whose interactions give rise to the behavior of the program; here most important functionality is closely coupled to a single object, and so is implemented in methods, with a scattering of utility functions. For other programs the most important stuff is a set of functions that manipulate data, and classes are in use only to implement the natural "duck types" that are manipulated by the functions.

Ben
0

Going on further from @endolith's comments on @Thriveth's answer:

height(person) vs person.height is (I think) comparing apples to cheese:

This boils down to naming conventions and good naming of functions and members. person.height should probably be person.height_in_metres (or some such) and height(person) could/should be get_height_of(person) or what_is_height_of(person) or measure_height(person) and then the 'conflict' goes away.

The functions are doing something (even if it is only returning person.height at the very worst) whereas person.height is some property of person.

Lozminda
  • This should possibly be a comment, but rep. I thought @endolith's comment was a good opportunity to help clarify, as it brings up a common issue. Sometimes this problem can be fixed with just naming (which may of course mean changing the programmer's perceptions/functionality in line with the code..) – Lozminda Mar 31 '23 at 02:56