65

I have a container class that holds data. When the container is created, there are different methods to pass data.

  1. Pass a file which contains the data
  2. Pass the data directly via arguments
  3. Don't pass data; just create an empty container

In Java, I would create three constructors. Here's what it would look like if it were possible in Python:

class Container:

    def __init__(self):
        self.timestamp = 0
        self.data = []
        self.metadata = {}

    def __init__(self, file):
        f = file.open()
        self.timestamp = f.get_timestamp()
        self.data = f.get_data()
        self.metadata = f.get_metadata()

    def __init__(self, timestamp, data, metadata):
        self.timestamp = timestamp
        self.data = data
        self.metadata = metadata

In Python, I see three obvious solutions, but none of them is pretty:

A: Using keyword arguments:

def __init__(self, **kwargs):
    if 'file' in kwargs:
        ...
    elif 'timestamp' in kwargs and 'data' in kwargs and 'metadata' in kwargs:
        ...
    else:
        ...  # create empty container

B: Using default arguments:

def __init__(self, file=None, timestamp=None, data=None, metadata=None):
    if file:
        ...
    elif timestamp and data and metadata:
        ...
    else:
        ...  # create empty container

C: Only provide constructor to create empty containers. Provide methods to fill containers with data from different sources.

def __init__(self):
    self.timestamp = 0
    self.data = []
    self.metadata = {}

def add_data_from_file(self, file):
    ...

def add_data(self, timestamp, data, metadata):
    ...

Solutions A and B are basically the same. I don't like doing the if/else, especially since I have to check if all arguments required for this method were provided. A is a bit more flexible than B if the code is ever to be extended by a fourth method to add data.

Solution C seems to be the nicest, but the user has to know which method he requires. For example: he can't do c = Container(args) if he doesn't know what args is.

What's the most Pythonic solution?

smci
Johannes
  • Related: https://stackoverflow.com/questions/7113032/overloaded-functions-in-python. – Christian Dean Jun 26 '17 at 17:44
  • There are other options, too. https://stackoverflow.com/questions/19305296/multiple-constructors-in-python-using-inheritance Also, I always try to make my code fit my needs, rather than writing around them to make my code more pure. – Dave Jun 26 '17 at 17:45
  • While all the answers here are focusing on providing a solution, [Jörg W Mittag provides a very nice explanation](https://stackoverflow.com/questions/9373104/why-doesnt-ruby-support-method-overloading/9380268#9380268) about why function overloading wouldn't make sense in dynamic languages. – Christian Dean Jun 26 '17 at 18:26

7 Answers

82

You can't have multiple methods with the same name in Python. Function overloading, unlike in Java, isn't supported.
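
To make this concrete, here is a minimal sketch of what happens if you write the question's class anyway. Each `def __init__` simply rebinds the same name in the class body, so only the last definition survives. The stripped-down `Container` below is illustrative, not the asker's real class.

```python
class Container:
    def __init__(self):
        self.data = []

    # This definition replaces the one above; no overload set is built.
    def __init__(self, timestamp, data, metadata):
        self.timestamp = timestamp
        self.data = data
        self.metadata = metadata


try:
    Container()  # the zero-argument version no longer exists
    error = None
except TypeError as exc:
    error = exc  # complains about the missing positional arguments

c = Container(0, [], {})  # only the last signature is callable
print(type(error).__name__)  # TypeError
```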

Use default parameters or **kwargs and *args arguments.

You can make static methods or class methods with the @staticmethod or @classmethod decorator to return an instance of your class, or to add other constructors.
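
A small sketch of how the two decorators differ once inheritance is involved. The class and method names (`Base`, `Derived`, `from_text`, `from_text_static`) are made up for illustration.

```python
class Base:
    def __init__(self, value=0):
        self.value = value

    @classmethod
    def from_text(cls, text):
        # cls is whichever class the method was called on
        return cls(int(text))

    @staticmethod
    def from_text_static(text):
        # The class name is hard-coded, so subclasses still get a Base
        return Base(int(text))


class Derived(Base):
    pass


a = Derived.from_text("42")
b = Derived.from_text_static("42")
print(type(a).__name__)  # Derived
print(type(b).__name__)  # Base
```

This is usually the reason to prefer `@classmethod` for alternative constructors: subclasses inherit them without having to override anything.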

I advise you to do:

class F:

    def __init__(self, timestamp=0, data=None, metadata=None):
        self.timestamp = timestamp
        self.data = list() if data is None else data
        self.metadata = dict() if metadata is None else metadata

    @classmethod
    def from_file(cls, path):
        _file = cls.get_file(path)
        timestamp = _file.get_timestamp()
        data = _file.get_data()
        metadata = _file.get_metadata()
        return cls(timestamp, data, metadata)

    @classmethod
    def from_metadata(cls, timestamp, data, metadata):
        return cls(timestamp, data, metadata)

    @staticmethod
    def get_file(path):
        # ...
        pass

⚠ Never have mutable types as defaults in python. ⚠ See here.
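
A minimal sketch of the gotcha, using two hypothetical helper functions rather than the class above. The default list is created once, when the `def` statement runs, and is then shared by every call that omits the argument.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A new list is created on every call that omits `bucket`
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(second)           # [1, 2]
print(first is second)  # True

print(append_fresh(1))  # [1]
print(append_fresh(2))  # [2]
```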

glegoux
  • Feel free to edit ;) sorry – glegoux Jun 26 '17 at 18:00
  • 1
    `@classmethod` would be cleaner; the approach is good. – 9000 Jun 26 '17 at 18:03
  • I think it would be a lot cleaner to have the constructor accept the three parameters, instead of always creating it with the defaults and then overwriting them. – Bergi Jun 26 '17 at 18:13
  • I like this solution, its basically the same as the one 9000 provided. I understand `@staticmethod` is sufficient, why should `@classmethod` be cleaner? – Johannes Jun 26 '17 at 18:19
  • 21
    Please, never, ever, ever, have mutable types as defaults in python. This is one of the first (and few) weird edge cases beginners need to learn in python. Try doing `x = F(); x.data.append(5); y = F(); print y.data`. You are in for a surprise. Idiomatic way would be to default to `None` instead, and assign to `self.data` and `self.metadata` within a conditional or with ternary operator. – Nir Friedman Jun 26 '17 at 19:29
  • Yes it true :) http://python-guide-pt-br.readthedocs.io/en/latest/writing/gotchas/ – glegoux Jun 26 '17 at 19:38
  • 3
    Johannes, others can correct me if I'm wrong (still new to Python), but I think it's because of inheritance. Suppose a new class, `G`, inherits class `F`. Using `@classmethod`, calling `G.from_file` gives an instance of `G`. Using `@staticmethod`, the class name is hardcoded into the method, so `G.from_file` will give an instance of `F` unless `G` overrides the method. – j_foster Jun 26 '17 at 23:02
  • 1
    @NirFriedman or `x = x or {}` (for dict), no `if` needed :) – Mark Jun 27 '17 at 14:33
  • 2
    @Mark And then if someone calls the constructor with an empty dict they intend to share with something else, it gets replaced by a new empty dict? That could lead to some nasty headscratchers: `my_dict = {}; f = F(metadata=my_dict); my_dict[1] = 2; f.metadata => {}`. Here, `f.metadata` should of course be `{1: 2}`. – marcelm Jun 27 '17 at 14:40
  • "You can't have multiple methods with same name in Python." -> Actually you can if you mess around with meta-classes and make it reroute the call to the correct method based on the arguments. Not advisable or simple, but certainly not impossible. – Anonymous Jun 27 '17 at 14:53
  • 1
    @Mark holy shit... Is this python, or lisp? Very cool. None of the guides that mention this gotcha use that trick. However you have to be rather careful as the behavior is not equivalent. I'd be curious to have some expert pythonistas weigh in on which is preferable. – Nir Friedman Jun 27 '17 at 15:17
  • 2
    @Mark Your comment inspired another question: https://stackoverflow.com/questions/44784276/idiomatic-way-to-default-mutable-arguments – Nir Friedman Jun 27 '17 at 15:30
  • @marcelm Well, my answer would be to (almost) never pass a dict to a function with the intention of having it mutated. That just leads to hard to read code and difficult to find bugs. But I'll grant that it was mostly just a cool shortcut, the functionality is inferior for some edge cases. – Mark Jun 27 '17 at 16:35
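
The difference between the two defaulting idioms discussed in these comments can be sketched as follows. `WithOr` and `WithIsNone` are hypothetical stand-ins for the answer's class.

```python
class WithOr:
    def __init__(self, metadata=None):
        # An empty dict is falsy, so it is silently replaced
        self.metadata = metadata or {}


class WithIsNone:
    def __init__(self, metadata=None):
        # Only a missing argument triggers the fallback
        self.metadata = {} if metadata is None else metadata


shared = {}
a = WithOr(metadata=shared)
b = WithIsNone(metadata=shared)
shared[1] = 2

print(a.metadata)  # {}
print(b.metadata)  # {1: 2}
```

Which behavior is right depends on whether callers are expected to keep a reference to the object they pass in. The `is None` form is the one that never discards a caller's argument.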
30

You can't have multiple constructors, but you can have multiple aptly-named factory methods.

class Document(object):

    def __init__(self, *args, **kwargs):  # whatever arguments you need
        """Do not invoke directly. Use from_NNN methods."""
        # Implementation is likely a mix of A and B approaches. 

    @classmethod
    def from_string(cls, string):
        # Do any necessary preparations, use the `string`
        return cls(...)

    @classmethod
    def from_json_file(cls, file_object):
        # Read and interpret the file as you want
        return cls(...)

    @classmethod
    def from_docx_file(cls, file_object):
        # Read and interpret the file as you want, differently.
        return cls(...)

    # etc.

You can't easily prevent the user from using the constructor directly, though. (If it is critical, as a safety precaution during development, you can analyze the call stack in the constructor and check that the call is made from one of the expected methods.)
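
A rough sketch of that call-stack check, assuming CPython and a single factory named `from_string`. The `_FACTORIES` set and the error message are invented for illustration. Treat it as a development aid only: it checks the caller's function name and nothing else, so any function that happens to share that name would pass, and a subclass calling `super().__init__` would be rejected.

```python
import inspect


class Document:
    # Names of the methods that are allowed to call the constructor
    _FACTORIES = {"from_string"}

    def __init__(self, text):
        # Frame 0 is __init__ itself; frame 1 is whoever called the class
        caller = inspect.stack()[1].function
        if caller not in self._FACTORIES:
            raise TypeError("use a Document.from_* factory method")
        self.text = text

    @classmethod
    def from_string(cls, string):
        return cls(string.strip())


doc = Document.from_string("  hello  ")
print(doc.text)  # hello

try:
    Document("hello")  # direct construction is rejected
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```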

9000
17

Most Pythonic would be what the Python standard library already does. Core developer Raymond Hettinger (the collections guy) gave a talk on this, plus general guidelines for how to write classes.

Use separate, class-level functions to initialize instances, like how dict.fromkeys() isn't the class initializer but still returns an instance of dict. This allows you to be flexible toward the arguments you need without changing method signatures as requirements change.
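
A few alternative constructors from the standard library, as a quick illustration of the pattern. Each one is a classmethod that returns an instance without going through the usual initializer signature.

```python
from datetime import date

defaults = dict.fromkeys(["timestamp", "data"], None)
number = int.from_bytes(b"\x01\x00", byteorder="big")
day = date.fromisoformat("2017-06-26")

print(defaults)  # {'timestamp': None, 'data': None}
print(number)    # 256
print(day.year)  # 2017
```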

Arya McCarthy
4

What are the system goals for this code? From my standpoint, your critical phrase is "but the user has to know which method he requires." What experience do you want your users to have with your code? That should drive the interface design.

Now, move to maintainability: which solution is easiest to read and maintain? Again, I feel that solution C is inferior. For most of the teams with whom I've worked, solution B is preferable to A: it's a little easier to read and understand, although both readily break into small code blocks for treatment.

Prune
3

I'm not sure if I understood right but wouldn't this work?

def __init__(self, file=None, timestamp=0, data=[], metadata={}):
    if file:
        ...
    else:
        self.timestamp = timestamp
        self.data = data
        self.metadata = metadata

Or you could even do:

def __init__(self, file=None, timestamp=0, data=[], metadata={}):
    if file:
        # Implement get_data to return all the stuff as a tuple
        timestamp, data, metadata = file.get_data()

    self.timestamp = timestamp
    self.data = data
    self.metadata = metadata

Thanks to Jon Kiparsky's advice, there's a better way that avoids the effectively global defaults for data and metadata, so this is the new version:

def __init__(self, file=None, timestamp=None, data=None, metadata=None):
    if file:
        # Implement get_data to return all the stuff as a tuple
        f = file.open()  # the hypothetical file API from the question
        timestamp, data, metadata = f.get_data()

    self.timestamp = timestamp or 0
    self.data = data or []
    self.metadata = metadata or {}
  • 5
    there's a subtle bug here. since the parameter list is evaluated when the function is first created, the list and dict for data and metadata will be effectively globals. Basically reasonable though, except for that gotcha. – Jon Kiparsky Jun 26 '17 at 18:05
  • so it might be better to use keyword arguments? – Gabriel Ecker Jun 26 '17 at 18:12
  • 3
    That, or you could use `None` for the defaults and then `self.data = data or []` – Jon Kiparsky Jun 26 '17 at 18:13
3

If you are on Python 3.4+ you can use the functools.singledispatch decorator to do this (with a little extra help from the methoddispatch decorator that @ZeroPiraeus wrote for his answer):

class Container:

    @methoddispatch
    def __init__(self):
        self.timestamp = 0
        self.data = []
        self.metadata = {}

    # The registered functions are named `_` so that the dispatcher
    # bound to __init__ is not overwritten by the last definition.
    @__init__.register(File)
    def _(self, file):
        f = file.open()
        self.timestamp = f.get_timestamp()
        self.data = f.get_data()
        self.metadata = f.get_metadata()

    @__init__.register(Timestamp)
    def _(self, timestamp, data, metadata):
        self.timestamp = timestamp
        self.data = data
        self.metadata = metadata
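
For a version that runs without the external `methoddispatch` helper, here is a hedged sketch that moves the dispatch into a module-level factory instead of `__init__`. The name `make_container` and the dict layout are assumptions made for this example. `functools.singledispatch` chooses the implementation from the type of the first positional argument and needs at least one argument, so the empty case is modeled by passing `None`.

```python
from functools import singledispatch


class Container:
    def __init__(self, timestamp=0, data=None, metadata=None):
        self.timestamp = timestamp
        self.data = [] if data is None else data
        self.metadata = {} if metadata is None else metadata


@singledispatch
def make_container(source):
    # Fallback for types with no registered handler
    raise TypeError(f"cannot build a Container from {type(source).__name__}")


@make_container.register(type(None))
def _(source):
    # make_container(None) stands in for "no data given"
    return Container()


@make_container.register(dict)
def _(source):
    # Hypothetical layout: a dict with the three fields as keys
    return Container(source["timestamp"], source["data"], source["metadata"])


empty = make_container(None)
filled = make_container({"timestamp": 5, "data": [1], "metadata": {"k": "v"}})
print(empty.data)        # []
print(filled.timestamp)  # 5
```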
Sean Vieira
-2

The most pythonic way is to make sure any optional arguments have default values. So include all arguments that you know you need and assign them appropriate defaults.

def __init__(self, timestamp=None, data=[], metadata={}):
    if timestamp is None:
        timestamp = time.time()  # needs `import time`

An important thing to remember is that any required arguments should not have defaults since you want an error to be raised if they're not included.

You can accept even more optional arguments using *args and **kwargs at the end of your arguments list.

def __init__(self, timestamp=None, data=[], metadata={}, *args, **kwargs):
    if 'something' in kwargs:
        pass  # do something
Soviut
  • 2
    Never use mutable types as defaults - it can introduce some very subtle bugs which can be a pain to find. It is always recommended to do something like `func( data = None ): data = data or []`; one more line,, one less bug :-) http://python-guide-pt-br.readthedocs.io/en/latest/writing/gotchas/ – Tony Suffolk 66 Jun 27 '17 at 06:47