28

I'm talking mostly about Python here, but I suppose this probably holds for most languages. If I have a mutable object, is it a bad idea to make an in-place operation also return the object? It seems like most examples just modify the object and return None. For example, list.sort.
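For concreteness, here is a minimal sketch of the behavior being asked about, using only built-ins (the variable names are just for illustration):

```python
# list.sort() modifies the list in place and returns None.
numbers = [3, 1, 2]
result = numbers.sort()
print(result)   # None
print(numbers)  # [1, 2, 3]

# sorted() leaves its argument alone and returns a new list.
original = [3, 1, 2]
copy_sorted = sorted(original)
print(copy_sorted)  # [1, 2, 3]
print(original)     # [3, 1, 2]
```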

asmeurer
  • I think it's all about consistency. Python is pretty consistent about methods on mutable objects being in-place operations. As long as you are consistent about it, there shouldn't be an issue with in-place operations returning an object or object reference. – Joel Cornett Oct 25 '12 at 06:30
  • But why is it like that in the first place? – asmeurer Oct 25 '12 at 06:38
  • 2
    I'm not 100% sure, but most of the time, there's no need for an in-place operation to return an object. You're not creating a new object that needs to be assigned, after all. Additionally, there are analogs to each in-place operation that make it explicit when you are returning something to do further operations on it. (e.g. `list.sort` vs. `sorted(list)`, `list.reverse` vs. `reversed(list)`) – Joel Cornett Oct 25 '12 at 06:43

4 Answers

34

Yes, it is a bad idea. If in-place and non-in-place operations have apparently identical output, programmers will frequently mix them up (list.sort() vs. sorted()), and that results in hard-to-detect errors.

In-place operations that return the object itself allow you to perform "method chaining". However, this is bad practice because you may accidentally bury functions with side effects in the middle of a chain.

To prevent errors like this, a method chain should contain only one method with side effects, and that method should be at the end of the chain. Functions earlier in the chain should transform the input without side effects (for instance, navigating a tree, slicing a string, etc.). If in-place operations return the object itself, then a programmer is bound to accidentally use one in place of an alternative function that returns a copy and therefore has no side effects (again, list.sort() vs. sorted()), which may result in an error that is difficult to debug.
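A rough sketch of the kind of bug described above. ChainableList and smallest_two are hypothetical names made up for illustration; the subclass deliberately does what the built-in list does not, namely return self from sort():

```python
class ChainableList(list):
    """Hypothetical list whose sort() returns self (the built-in returns None)."""

    def sort(self, **kwargs):
        super().sort(**kwargs)
        return self


def smallest_two(items):
    # Reads like a side-effect-free query, but sort() reorders the caller's list.
    return items.sort()[:2]


scores = ChainableList([30, 10, 20])
top = smallest_two(scores)
print(top)     # [10, 20]
print(scores)  # [10, 20, 30], the caller's data was silently reordered

# With a plain list the same mistake fails immediately instead.
try:
    smallest_two([30, 10, 20])
    failed_fast = False
except TypeError:
    failed_fast = True
print(failed_fast)  # True
```

With the built-in list, sort() returns None, so the misuse surfaces as a TypeError at the point of the mistake rather than as a reordered list somewhere else in the program.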

This is why Python standard library methods that modify an object in place generally return None, while functions that return a new result leave their input untouched; as a rule, they do not modify an object in place and also return that same object. Other Python libraries like Django also follow this practice (see this very similar question about Django).

Andrew Gorcester
  • Agreed as a general rule, but I think there are exceptions in some specific cases, which are not so rare. Eg1: When the semantics of the method clearly describe an in-place operation, like jQuery's `.empty()`. Eg2: When the API is so commonly used that everybody knows it from day one and it has no version that returns a copy, like jQuery's `.append()` – Samuel Rossille Nov 26 '12 at 23:01
  • Just because the method name is a present tense verb, doesn't mean that people will find it obvious that the operation acts in-place. It took me a long time after learning Python before I always remembered that `list.sort` acted in-place, even though the name sounds like it should do that. – asmeurer Nov 27 '12 at 04:26
  • Isn't there still confusion with ending a chain with an in-place operation? `a.sort()` and `a[:2].sort()` are going to do completely different things (I guess it would be different if you used something like a numpy `array`, which uses views). Maybe the point is that `sort` returning `None` protects you from thinking that `a[:2].sort()` does anything useful? – asmeurer Nov 27 '12 at 19:36
  • 1
    Yes, the point is that `sort` fails immediately when it is used for its return value (because it returns `None`), instead of silently causing a side-effect that the developer may not intend to cause. – Andrew Gorcester Nov 27 '12 at 20:32
10

Returning the modified object from the method that modified it can have some benefits, but is not recommended in Python. Returning self after a modification operation allows you to perform method chaining on the object, which is a convenient way of executing several methods on the same object and a very common idiom in object-oriented programming. In turn, method chaining allows a straightforward implementation of fluent interfaces. It also allows some functional-programming idioms to be expressed more easily.

To name a few examples: in Python, the Moka library uses method chaining. In Java, the StringBuilder class allows multiple append() invocations on the same object. In JavaScript, jQuery uses method chaining extensively. Smalltalk takes this idea to the next level: by default, all methods return self unless otherwise specified (therefore encouraging method chaining). Contrast this with Python, which returns None by default.
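As an illustration of what such a fluent interface might look like in Python, here is a small sketch. QueryBuilder and its methods are hypothetical, invented for this example, and are not taken from any of the libraries mentioned above:

```python
class QueryBuilder:
    """Hypothetical builder: each mutating method returns self to allow chaining."""

    def __init__(self):
        self.parts = []

    def select(self, *columns):
        self.parts.append("SELECT " + ", ".join(columns))
        return self

    def from_table(self, name):
        self.parts.append("FROM " + name)
        return self

    def build(self):
        # Final step of the chain: returns a string rather than the builder.
        return " ".join(self.parts)


sql = QueryBuilder().select("id", "name").from_table("users").build()
print(sql)  # SELECT id, name FROM users
```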

The use of this idiom is not common in Python, because Python abides by the Command/Query Separation Principle, which states that "every method should either be a command that performs an action, or a query that returns data to the caller, but not both".
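A minimal sketch of that separation, using a hypothetical Counter class (the class and method names are assumptions for illustration only):

```python
class Counter:
    """Hypothetical class illustrating command/query separation."""

    def __init__(self):
        self._count = 0

    def increment(self):
        # Command: changes state and returns nothing (implicitly None).
        self._count += 1

    def value(self):
        # Query: returns data and changes nothing.
        return self._count


counter = Counter()
outcome = counter.increment()
print(outcome)          # None
print(counter.value())  # 1
```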

All things considered, whether it's a good or bad idea to return self at the end is a matter of programming culture and convention, mixed with personal taste. As mentioned above, some programming languages encourage this (like Smalltalk) whereas others discourage it (like Python). Each point of view has advantages and disadvantages, open to heated discussions. If you're a by-the-book Pythonist, better refrain from returning self - just be aware that sometimes it can be useful to break this rule.

Óscar López
  • 1
    Thanks for the excellent answer, especially the link to the Command/Query Separation Principle, which helps me attach a label to some design trade-offs I've been mulling over lately. – FMc Jun 22 '13 at 18:42
1

The answers here about not returning from in-place operations messed me up for a bit until I came across this other SO post that links to the Python documentation (which I thought I read, but must have only skimmed). The documentation, in reference to in-place operators, says:

These methods should attempt to do the operation in-place (modifying self) and return the result (which could be, but does not have to be, self).

When I implemented an in-place operator without returning self, the name it was applied to became None. In the example below, the call to vars in method one fails with an error saying that its argument must have a __dict__ attribute. Looking at the type of self at that point shows NoneType.

# Skipping type enforcement and such.
from copy import copy
import operator
# Hypothetical helper, standing in for any function that takes a dict.
from imported_utility import imported_utility
class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def one(self, scalar):
        # Fails: __imul__ below returns None, so self is rebound to None here.
        self *= scalar
        return imported_utility(vars(self))
    def two(self, scalar):
        tmp = self * scalar
        return imported_utility(vars(tmp))
    def three(self, scalar):
        return imported_utility(vars(self * scalar))
    # ... addition, subtraction, etc.; as below.
    def __mul__(self, other):
        tmp = copy(self)
        tmp._inplace_operation(other, operator.imul)
        return tmp
    def __imul__(self, other):
        # Broken on purpose: modifies self but implicitly returns None.
        self._inplace_operation(other, operator.imul)
    def _inplace_operation(self, other, op):
        self.a = op(self.a, other)
        self.b = op(self.b, other)

* works (two and three), but *= (one) does not until self is returned.

    def __imul__(self, other):
        return self._inplace_operation(other, operator.imul)
    def _inplace_operation(self, other, op):
        self.a = op(self.a, other)
        self.b = op(self.b, other)
        return self

I do not fully understand this behavior, but a follow-up comment on the referenced post says that without returning self, the in-place method really is modifying the object, but the name is then rebound to None. The statement x *= y assigns the return value of x.__imul__(y) back to x, so unless self is returned, Python has nothing but None to rebind to. That behavior can be seen by keeping a separate reference to the object.
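A small sketch of that experiment. Broken and Fixed are hypothetical class names used only for this illustration:

```python
class Broken:
    def __init__(self, value):
        self.value = value

    def __imul__(self, other):
        # Mutates the object but implicitly returns None.
        self.value *= other


class Fixed(Broken):
    def __imul__(self, other):
        self.value *= other
        return self  # the augmented assignment rebinds the name to this


b = Broken(2)
alias = b            # separate reference to the same object
b *= 3
print(b)             # None: the name was rebound to __imul__'s return value
print(alias.value)   # 6: the object itself was still modified

f = Fixed(2)
f *= 3
print(f.value)       # 6
```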

Kevin
0

I suppose it depends on the use case. I don't see why returning an object from an in-place operation would hurt, other than maybe you won't use the result, but that's not really a problem if you're not being super-fastidious about pure functionalism. I like the call-chaining pattern, such as jQuery uses, so I appreciate it when functions return the object they've acted upon, in case I want to use it further.

Peter Hull