I've been reading about OOP and trying to grasp the concepts of self and __init__, and I think I found an explanation that makes sense (to me at least). It comes from an article on building a linear regression estimator using OOP concepts.

Article Link

class MyLinearRegression:

    def __init__(self, fit_intercept=True):
        self.coef_ = None
        self.intercept_ = None
        self._fit_intercept = fit_intercept

The layman explanation is as follows:

At a high level, __init__ provides a recipe for how to build an instance of MyLinearRegression ... Since an instance of MyLinearRegression can take on any name a user gives it, we need a way to link the user’s instance name back to the class so we can accomplish certain tasks. Think of self as a variable whose sole job is to learn the name of a particular instance

So I think this makes sense. What I don't get is why self is used again when defining new methods.

def predict(self, X):
    """
    Output model prediction.

    Arguments:
    X: 1D or 2D numpy array 
    """

    # check if X is 1D or 2D array
    if len(X.shape) == 1:
        X = X.reshape(-1,1) 
    return self.intercept_ + np.dot(X, self.coef_)

In this version, what is self referring to?

wjandrea
semidevil

4 Answers

self (or, more generally, the first parameter of an instance method; the name self is only a convention) refers to the instance whose method has been called. In your example, the return statement accesses the intercept_ and coef_ attributes of that specific instance.

Consider the following example:

class C:
    def m(self):
        print(self.a)

c1 = C()
c1.a = 1
c2 = C()
c2.a = 2
c1.m()  # prints 1, value of "c1.a" 
c2.m()  # prints 2, value of "c2.a" 

We have a class C and we instantiate two objects, c1 and c2. We assign a different value to the attribute a of each instance, and then we call the method m, which accesses attribute a of its own instance and prints it.
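To make the point that self is the very object the method was called on, here is a minimal sketch. The Counter class and its increment method are hypothetical names made up for this illustration; they do not come from the question or the article.

```python
class Counter:
    """Hypothetical class used only for this illustration."""

    def __init__(self):
        # Each instance gets its own count attribute.
        self.count = 0

    def increment(self):
        # self is whichever instance the method was called on.
        self.count += 1
        return self


a = Counter()
b = Counter()

returned = a.increment()  # inside increment, self is a
a.increment()

print(returned is a)  # True: the method handed back the same object
print(a.count)        # 2
print(b.count)        # 0, because b was never touched
```

Since increment returns self, comparing the result with a shows that no copy is involved: the method worked directly on the instance it was called on.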

Ondrej K.

If a function in a class takes self as its first parameter, it is an instance method: it expects to be called on an instance of the class, which Python passes in as self. Calling it on the class itself without supplying an instance raises a TypeError. The functions in a class can be classified as class methods, instance methods and static methods.

class method: A method that can be called on either the class or an instance, and receives the class as its first argument. It is usually used with variables that belong to the class rather than to an instance.

instance method: A method that is normally called on an instance of the class and receives that instance as its first argument. It is usually used with variables that belong to the instance.

static method: A method that can be called on either the class or an instance and receives neither of them automatically. It is usually used with variables that belong neither to the class nor to the instance.

class X:
    x = 2

    def __init__(self):
        self.x = 1

    def instance_method(self):
        return self.x

    @classmethod
    def class_method(cls):
        return cls.x


try:
    print(X.instance_method())  # no instance is passed for self
except TypeError as error:
    print("TypeError:", error)
print(X().instance_method())  # no TypeError, prints 1
print(X.class_method())  # no TypeError, prints 2
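The example above leaves out the static method case, so here is a rough sketch that puts all three kinds side by side. The Temperature class and its method names are invented for this illustration.

```python
class Temperature:
    """Hypothetical class used only for this illustration."""

    scale = "Celsius"  # class attribute, shared by all instances

    def __init__(self, degrees):
        self.degrees = degrees  # instance attribute

    def describe(self):
        # Instance method: receives the instance as self.
        return f"{self.degrees} {self.scale}"

    @classmethod
    def default(cls):
        # Class method: receives the class as cls.
        return cls(20)

    @staticmethod
    def to_fahrenheit(degrees):
        # Static method: receives neither the instance nor the class.
        return degrees * 9 / 5 + 32


print(Temperature.default().describe())       # 20 Celsius
print(Temperature.to_fahrenheit(100))         # 212.0, called on the class
print(Temperature(0).to_fahrenheit(100))      # 212.0, called on an instance
print(Temperature.describe(Temperature(5)))   # 5 Celsius
```

The last line shows that an instance method can still be reached through the class, as long as an instance is supplied explicitly for self.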
dildeolupbiten

When you create an instance of the class MyLinearRegression i.e.

linear_regression = MyLinearRegression(fit_intercept=True)

Your linear_regression object has been initialised with the following attributes:

linear_regression.coef_ = None
linear_regression.intercept_ = None
linear_regression._fit_intercept = True

Notice here how the "self" in the class definition refers to the object instance that we created (i.e. linear_regression).

The instance method "predict" can be called as follows:

linear_regression.predict(X)

Here Python adds syntactic sugar, so under the hood the method call above is equivalent to the following:

MyLinearRegression.predict(linear_regression, X)

This takes the instance "linear_regression" and inserts it in place of "self".
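The equivalence of the two call forms can be checked directly. The sketch below uses a hypothetical TinyLine class (plain floats instead of NumPy arrays) as a stand-in for MyLinearRegression, so that it runs without any fitted model.

```python
class TinyLine:
    """Hypothetical stand-in for MyLinearRegression, for illustration only."""

    def __init__(self, intercept, slope):
        self.intercept_ = intercept
        self.slope_ = slope

    def predict(self, x):
        return self.intercept_ + self.slope_ * x


line = TinyLine(intercept=1.0, slope=2.0)

via_instance = line.predict(3.0)          # Python supplies line as self
via_class = TinyLine.predict(line, 3.0)   # we supply line as self ourselves
print(via_instance, via_class)            # 7.0 7.0

# The bound method remembers which instance it belongs to.
bound = line.predict
print(bound.__self__ is line)              # True
print(bound.__func__ is TinyLine.predict)  # True
```

Looking up predict on the instance produces a bound method that carries the instance in __self__ and the plain function in __func__; calling it passes the instance as the first argument.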

Note: For additional reference, you can list all of the attributes and methods of any object with the following:

print(dir(<insert_object_here>))

I hope this helped.

redmonkey

I think it may help to refer to what the Python docs have to say about self in the "Random Remarks" section of the tutorial page on classes:

Often, the first argument of a method is called self. This is nothing more than a convention: the name self has absolutely no special meaning to Python. Note, however, that by not following the convention your code may be less readable to other Python programmers, and it is also conceivable that a class browser program might be written that relies upon such a convention.

This is an important distinction to make because there's a difference depending on whether predict is in a class or not. Let's revisit an expanded version of your example:

import numpy as np

class MyLinearRegression:
    def __init__(self, fit_intercept=True):
        self.coef_ = None
        self.intercept_ = None
        self._fit_intercept = fit_intercept

    def predict(self, X):
        """
        Output model prediction.

        Arguments:
        X: 1D or 2D numpy array 
        """

        # check if X is 1D or 2D array
        if len(X.shape) == 1:
            X = X.reshape(-1,1) 
        return self.intercept_ + np.dot(X, self.coef_)

mlr = MyLinearRegression()
mlr.predict(SomeXData)

When mlr.predict() is called, the mlr instance is passed in as the first parameter of the function predict. This is so that predict can refer to the instance it was called on, and through it to attributes such as intercept_ and coef_. It's important to note that __init__ is not special with respect to self: every instance method accepts, as its first parameter, a reference to the instance on which it was called.

This is not the only approach. Consider this alternate example:

import numpy as np

class MyLinearRegression:
    def __init__(self, fit_intercept=True):
        self.coef_ = None
        self.intercept_ = None
        self._fit_intercept = fit_intercept

def predict(self, X):
    """
    Output model prediction.

    Arguments:
    X: 1D or 2D numpy array 
    """

    # check if X is 1D or 2D array
    if len(X.shape) == 1:
        X = X.reshape(-1,1) 
    return self.intercept_ + np.dot(X, self.coef_)

mlr = MyLinearRegression()
predict(mlr, SomeXData)

The signature for predict hasn't changed, just the way the function is called. This is why self isn't special as a parameter name. We could pass any object to predict as its first argument and the call would go ahead, although it would raise an AttributeError unless that object has intercept_ and coef_ attributes.
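Both points, that the parameter name is free and that any suitable object can be passed, can be sketched as follows. This is a simplified illustration that uses plain floats instead of NumPy arrays; MyLine, the parameter names model and this, and the SimpleNamespace stand-in are all assumptions made for the example.

```python
from types import SimpleNamespace


def predict(model, x):
    # A module-level function; the first parameter is named model, not self.
    return model.intercept_ + x * model.coef_


class MyLine:
    """Hypothetical class used only for this illustration."""

    def __init__(self, intercept, coef):
        self.intercept_ = intercept
        self.coef_ = coef

    def predict(this, x):
        # Legal but unconventional: the instance parameter is named this.
        return this.intercept_ + x * this.coef_


# Any object with the right attributes works as the first argument.
fake = SimpleNamespace(intercept_=1.0, coef_=2.0)
result_fake = predict(fake, 3.0)
result_line = MyLine(1.0, 2.0).predict(3.0)
print(result_fake, result_line)  # 7.0 7.0

# An object without those attributes fails at attribute lookup.
try:
    predict(object(), 3.0)
    failed = False
except AttributeError:
    failed = True
print(failed)  # True
```

Naming the parameter this works, but as the quoted docs point out, sticking with self keeps the code readable for other Python programmers.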

Bash