153

The question I'm about to ask seems to be a duplicate of "Python's use of __new__ and __init__?", but regardless, it's still unclear to me exactly what the practical difference between __new__ and __init__ is.

Before you rush to tell me that __new__ is for creating objects and __init__ is for initializing objects, let me be clear: I get that. In fact, that distinction is quite natural to me, since I have experience in C++ where we have placement new, which similarly separates object allocation from initialization.

The Python C API tutorial explains it like this:

The new member is responsible for creating (as opposed to initializing) objects of the type. It is exposed in Python as the __new__() method. ... One reason to implement a new method is to assure the initial values of instance variables.

So, yeah - I get what __new__ does, but despite this, I still don't understand why it's useful in Python. The example given says that __new__ might be useful if you want to "assure the initial values of instance variables". Well, isn't that exactly what __init__ will do?

In the C API tutorial, an example is shown where a new Type (called a "Noddy") is created, and the Type's __new__ function is defined. The Noddy type contains a string member called first, and this string member is initialized to an empty string like so:

static PyObject * Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    .....

    self->first = PyString_FromString("");
    if (self->first == NULL)
    {
       Py_DECREF(self);
       return NULL;
    }

    .....
}

Note that without the __new__ method defined here, we'd have to use PyType_GenericNew, which simply initializes all of the instance variable members to NULL. So the only benefit of the __new__ method is that the instance variable will start out as an empty string, as opposed to NULL. But why is this ever useful, since if we cared about making sure our instance variables are initialized to some default value, we could have just done that in the __init__ method?

Community
Channel72
  • "The question I'm about to ask seems to be a duplicate of..." To the extent that any information here isn't there already, it **belongs there**. – Karl Knechtel Sep 05 '22 at 05:14
  • That may be the case in theory, but the practice is that StackOverflow offers no good way for someone to say "Hey, this existing question doesn't actually answer my question, please give a better answer" other than asking a new question. – linkhyrule5 Aug 19 '23 at 08:23

6 Answers

158

The difference mainly arises with mutable vs immutable types.

__new__ accepts a type as the first argument, and (usually) returns a new instance of that type. Thus it is suitable for use with both mutable and immutable types.

__init__ accepts an instance as the first argument and modifies the attributes of that instance. This is inappropriate for an immutable type, as it would allow them to be modified after creation by calling obj.__init__(*args).

Compare the behaviour of tuple and list:

>>> x = (1, 2)
>>> x
(1, 2)
>>> x.__init__([3, 4])
>>> x # tuple.__init__ does nothing
(1, 2)
>>> y = [1, 2]
>>> y
[1, 2]
>>> y.__init__([3, 4])
>>> y # list.__init__ reinitialises the object
[3, 4]

As to why they're separate (aside from simple historical reasons): __new__ methods require a bunch of boilerplate to get right (the initial object creation, and then remembering to return the object at the end). __init__ methods, by contrast, are dead simple, since you just set whatever attributes you need to set.

Aside from __init__ methods being easier to write, and the mutable vs immutable distinction noted above, the separation can also be exploited to make calling the parent class __init__ in subclasses optional by setting up any absolutely required instance invariants in __new__. This is generally a dubious practice though - it's usually clearer to just call the parent class __init__ methods as necessary.
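To illustrate the last point, here is a minimal sketch of setting up a required invariant in __new__ so that a subclass can skip the parent __init__ call. The names `Base`, `Child`, `_handlers` and `add_handler` are hypothetical and chosen only for this illustration.

```python
class Base:
    def __new__(cls, *args, **kwargs):
        # Create the instance, then establish the invariant that every
        # Base method relies on. (Hypothetical attribute name.)
        self = super().__new__(cls)
        self._handlers = []
        return self

    def add_handler(self, handler):
        self._handlers.append(handler)


class Child(Base):
    def __init__(self, name):
        # Deliberately does NOT call super().__init__();
        # _handlers already exists because Base.__new__ created it.
        self.name = name


child = Child("example")
child.add_handler(print)
print(len(child._handlers))  # 1
```

As the answer says, this is rarely the clearest design; calling the parent __init__ explicitly is usually easier to follow.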

ncoghlan
  • 1
    the code you refer to as "boilerplate" in `__new__` isn't boilerplate, because boilerplate never changes. Sometimes you need to replace that particular code with something different. – Miles Rout Jan 23 '13 at 07:34
  • 17
    Creating, or otherwise acquiring, the instance (usually with a `super` call) and returning the instance are necessary parts of any `__new__` implementation, and the "boilerplate" I am referring to. By contrast, `pass` is a valid implementation for `__init__` - there is no required behaviour whatsoever. – ncoghlan Jan 24 '13 at 10:08
  • 1
    This is the clearest answer on the topic. Extra remark for readers: `tuple.__init__` does nothing because it resolves to `object.__init__` (which does nothing), whereas `list.__init__` (which initialises the instance) overrides `object.__init__`. – Géry Ogam Apr 08 '22 at 18:51
  • Excellent explanation. – Algo Jun 29 '22 at 09:57
47

__new__() can return objects of types other than the class it's bound to. __init__() only initializes an existing instance of the class.

>>> class C(object):
...   def __new__(cls):
...     return 5
...
>>> c = C()
>>> print(type(c))
<class 'int'>
>>> print(c)
5
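A related consequence, sketched below for Python 3: when __new__ returns something that is not an instance of the class, __init__ is not called at all. The class names `ReturnsInt` and `ReturnsSelf` are hypothetical.

```python
class ReturnsInt:
    init_called = False

    def __new__(cls):
        # Not an instance of ReturnsInt, so __init__ is skipped.
        return 5

    def __init__(self):
        ReturnsInt.init_called = True


class ReturnsSelf:
    init_called = False

    def __new__(cls):
        # A real instance of cls, so __init__ runs afterwards.
        return super().__new__(cls)

    def __init__(self):
        ReturnsSelf.init_called = True


x = ReturnsInt()
y = ReturnsSelf()
print(type(x), ReturnsInt.init_called)
print(type(y).__name__, ReturnsSelf.init_called)
```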
Miles Rout
Ignacio Vazquez-Abrams
  • This is the leanest explanation so far. – Tarik Nov 12 '11 at 18:05
  • Not quite true. I have `__init__` methods which contain code that looks like `self.__class__ = type(...)`. That causes the object to be of a different class than the one you thought you were creating. I can't actually change it into an `int` as you did... I get an error about heap types or something... but my example of assigning it to a dynamically created class works. – ArtOfWarfare Jan 21 '16 at 18:52
  • I, too, am confused about when `__init__()` is called. For example, in [lonetwin's answer](https://stackoverflow.com/a/43370010/355230), either `Triangle.__init__()` or `Square.__init__()` automatically get called depending on which type `__new__()` returns. From what you say in your answer (and I've read this elsewhere), it doesn't seem like either of them should be since `Shape.__new__()` **isn't** returning an instance of `cls` (nor one of a subclass of it). – martineau Jun 29 '18 at 20:54
  • 3
    @martineau: The `__init__()` methods in lonetwin's answer get called when the individual objects get instantiated (i.e. when *their* `__new__()` method returns), not when `Shape.__new__()` returns. – Ignacio Vazquez-Abrams Jun 29 '18 at 20:56
  • 1
    Ahh, right, `Shape.__init__()` (if it had one) wouldn't be called. Now it's all making more sense... `:¬)` – martineau Jun 29 '18 at 21:01
45

There are probably other uses for __new__ but there's one really obvious one: You can't subclass an immutable type without using __new__. So for example, say you wanted to create a subclass of tuple that can contain only integral values between 0 and size.

class ModularTuple(tuple):
    def __new__(cls, tup, size=100):
        tup = (int(x) % size for x in tup)
        return super(ModularTuple, cls).__new__(cls, tup)

You simply can't do this with __init__: by the time __init__ runs, the tuple's contents are already fixed, and any attempt to modify self there (for example, by item assignment) raises a TypeError because tuples are immutable.
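A quick way to see the difference, written as a Python 3 sketch (so it uses the zero-argument form of super()). `InitOnlyTuple` is a hypothetical counter-example that tries to do the same job from __init__.

```python
class ModularTuple(tuple):
    def __new__(cls, tup, size=100):
        # Reduce the values before the tuple is built.
        return super().__new__(cls, (int(x) % size for x in tup))


class InitOnlyTuple(tuple):
    """Hypothetical counter-example: tries to adjust values in __init__."""

    def __init__(self, tup):
        # tuple.__new__ has already stored the original values, so
        # item assignment fails here.
        try:
            self[0] = 0
        except TypeError as exc:
            self.error = str(exc)


mt = ModularTuple([101, 205, 7])
print(mt)  # (1, 5, 7)

bad = InitOnlyTuple([101, 205, 7])
print(bad)  # (101, 205, 7)
print(bad.error)
```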

senderle
  • 1
    I don't understand why we should use super. I mean, why should __new__ return an instance of the superclass? Furthermore, as you put it, why should we pass cls explicitly to __new__? Doesn't super(ModularTuple, cls) return a bound method? – Alcott Sep 19 '11 at 13:02
  • 4
    @Alcott, I think you're misunderstanding the behavior of `__new__`. We pass `cls` explicitly to `__new__` because, as you can read [here](http://docs.python.org/reference/datamodel.html#object.__new__) `__new__` _always_ requires a type as its first argument. It then returns an instance of that type. So we aren't returning an instance of the superclass -- we're returning an instance of `cls`. In this case, it's just the same as if we had said `tuple.__new__(ModularTuple, tup)`. – senderle Sep 19 '11 at 19:46
  • 1
    @senderle, can we say that the statement `super(ModularTuple, cls).__new__(cls, tup)` is the same as `super().__new__(cls, tup)`? – sakeesh Aug 26 '20 at 01:23
  • 2
    @sakeesh ah, this is a very old answer, and does not account for the Python 3 way of doing things. In Python 3, I think you are right, but I will have to look up the details to be sure. – senderle Aug 26 '20 at 18:48
15

Not a complete answer but perhaps something that illustrates the difference.

__new__ will always get called when an object has to be created. There are some situations where __init__ will not get called. One example is when you unpickle objects from a pickle file, they will get allocated (__new__) but not initialised (__init__).
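The pickle case can be checked with a small sketch like the one below. The `Tracked` class and its counters are hypothetical, and the sketch assumes it is run as a script with the default pickle protocol, so that pickle can find the class by name.

```python
import pickle


class Tracked:
    new_calls = 0
    init_calls = 0

    def __new__(cls, *args, **kwargs):
        Tracked.new_calls += 1
        return super().__new__(cls)

    def __init__(self, value):
        Tracked.init_calls += 1
        self.value = value


original = Tracked(42)
restored = pickle.loads(pickle.dumps(original))

print(restored.value)      # 42 (state restored from the pickle)
print(Tracked.new_calls)   # 2  (once for original, once on unpickling)
print(Tracked.init_calls)  # 1  (unpickling did not call __init__)
```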

Noufal Ibrahim
  • Would I call __init__ from __new__ if I wanted memory to be allocated and data to be initialized? And why does __init__ still get called when creating an instance if the class doesn't define __new__? – redpix_ Dec 22 '15 at 09:00
  • 2
    The job of the `__new__` method is to *create* (this implies memory allocation) an instance of the class and return it. The initialisation is a separate step and it's what is usually visible to the user. Please ask a separate question if there's a specific problem you're facing. – Noufal Ibrahim Dec 22 '15 at 14:06
9

Just want to add a word about the intent (as opposed to the behavior) of defining __new__ versus __init__.

I came across this question (among others) when I was trying to understand the best way to define a class factory. I realized that one of the ways in which __new__ is conceptually different from __init__ is the fact that the benefit of __new__ is exactly what was stated in the question:

So the only benefit of the __new__ method is that the instance variable will start out as an empty string, as opposed to NULL. But why is this ever useful, since if we cared about making sure our instance variables are initialized to some default value, we could have just done that in the __init__ method?

Considering the stated scenario, we care about the initial values of the instance variables when the instance is in reality a class itself. So, if we are dynamically creating a class object at runtime and we need to define/control something special about the subsequent instances of this class being created, we would define these conditions/properties in a __new__ method of a metaclass.

I was confused about this until I actually thought about the application of the concept rather than just the meaning of it. Here's an example that would hopefully make the difference clear:

# I want `Shape(...)` to return an instance of either `Square` or `Triangle`,
# depending on the number of sides, and the `.area()` method to do the right
# thing. How do I do that without creating a Shape class whose methods are
# full of `if`s? Here is one possibility:

class Shape:
    def __new__(cls, sides, *args, **kwargs):
        # Delegate construction to the concrete class for this side count.
        if sides == 3:
            return Triangle(*args, **kwargs)
        else:
            return Square(*args, **kwargs)

class Triangle:
    def __init__(self, base, height):
        self.base = base
        self.height = height

    def area(self):
        return (self.base * self.height) / 2

class Square:
    def __init__(self, length):
        self.length = length

    def area(self):
        return self.length * self.length

# Usage (must come after the class definitions above):
a = Shape(sides=3, base=2, height=12)
b = Shape(sides=4, length=2)
print(a.area())
print(b.area())

Note that this is just a demonstrative example. There are multiple ways to get a solution without resorting to a class factory approach like the one above, and even if we do choose to implement the solution in this manner, a few caveats have been left out for the sake of brevity (for instance, declaring the metaclass explicitly).

If you are creating a regular class (i.e. a non-metaclass), then __new__ doesn't really make sense unless it is a special case like the mutable versus immutable scenario in ncoghlan's answer (which is essentially a more specific example of the concept of defining the initial values/properties of the class/type being created via __new__, to be then initialized via __init__).
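For the metaclass case mentioned earlier in this answer, a minimal sketch might look like the following. It is not taken from the answer itself: `ShapeMeta`, `registry`, `BaseShape`, `Tri` and the default `sides` value are all hypothetical names used to show a metaclass __new__ fixing properties of each class at the moment the class object is created.

```python
class ShapeMeta(type):
    # Hypothetical registry of every class created through this metaclass.
    registry = {}

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Guarantee that every class has a `sides` attribute before the
        # class object exists.
        namespace.setdefault("sides", 0)
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        mcls.registry[name] = cls
        return cls


class BaseShape(metaclass=ShapeMeta):
    pass


class Tri(BaseShape):
    sides = 3


print(BaseShape.sides)  # 0
print(Tri.sides)        # 3
print(sorted(ShapeMeta.registry))  # ['BaseShape', 'Tri']
```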

martineau
lonetwin
0

One particular use of __new__ is to make the class a singleton:

class SingletonClass(object):
  def __new__(cls):
    if not hasattr(cls, 'instance'):
      cls.instance = super(SingletonClass, cls).__new__(cls)
    return cls.instance 

(source: Singleton Pattern in Python - A Complete Guide - GeeksforGeeks)

vicmortelmans
  • Just one caveat to the above: initialization should be done inside the `__new__` function. If you have a separate `__init__` function, it will be executed every time the constructor is called, which is probably not what you want. – vicmortelmans Apr 08 '21 at 19:43
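The caveat in the comment above can be reproduced with a short Python 3 sketch. `Config`, `_instance` and `name` are hypothetical names used only for this illustration.

```python
class Config:
    _instance = None

    def __new__(cls, *args, **kwargs):
        # Create the single instance on first use, then keep returning it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self, name="default"):
        # Runs on EVERY Config(...) call, because __new__ always returns
        # an instance of cls.
        self.name = name


first = Config("first")
second = Config("second")

print(first is second)  # True
print(first.name)       # second (overwritten by the second __init__ call)
```

If the re-initialisation is unwanted, one option is to do the one-time setup inside the `if` branch of __new__ and leave __init__ undefined.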