operands could not be broadcast together error in python

Question

I am trying to implement multiple linear regression by hand, and I have written the following code:

#building the model

# start with zero weights and zero bias

import numpy as np
import matplotlib.pyplot as plt
n_samples,n_features=X_train.shape

weight=np.zeros(n_features)
bias=0

lr=0.0001 # lr is the learning rate

mse=[]

for i in range (0,20000):
    Y_pred=np.dot(X_train,weight)+bias
    mse.append(np.sum((Y_pred-Y_train)**2)/n_samples)
    
    # compute gradients
    dw = (1 / n_samples) * np.dot(X_train.T, (Y_pred - Y_train))
    db = (1 / n_samples) * np.sum(Y_pred - Y_train)
    
    # update parameters
    weight -= lr * dw
    bias -= lr * db
    
plt.plot(mse)

But I am getting the following error, and I don't know why:

> ValueError                                Traceback (most recent call last)
> <ipython-input-17-6302f8353768> in <module>
>      22 
>      23     # update parameters
> ---> 24     weight -= lr * dw
>      25     bias -= lr * db
>      26 
> 
> ValueError: operands could not be broadcast together with shapes (6,) (6,40) (6,)

I am new to Python and cannot find where I am going wrong. I have seen this Stack Overflow question, but I am still unable to figure out my mistake. Any help would be appreciated.

With dummy input, it's working fine on my end. What's the shape of `X_train` and `Y_train`? — kwkt, Jan 08 '21 at 15:08
`X_train` is (40,6) and `Y_train` is (40,1); I am using a Jupyter notebook — , Jan 08 '21 at 17:12

score 1 · Answer 1 · answered Jan 09 '21 at 01:53

Change your weight initialization to this:

weight = np.zeros((n_features, 1))
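As a minimal sketch of the loop with this change applied: the data below is random and purely hypothetical (only the shapes (40, 6) and (40, 1) come from the comments), and the learning rate and iteration count are chosen so that this toy example converges quickly, not taken from the question.

```python
import numpy as np

# Hypothetical data; only the shapes match the ones reported in the comments.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 6))
true_w = rng.normal(size=(6, 1))           # made-up "true" weights
Y_train = X_train @ true_w + 0.5           # targets, shape (40, 1)

n_samples, n_features = X_train.shape
weight = np.zeros((n_features, 1))         # column vector, same layout as Y_train
bias = 0.0
lr = 0.01                                  # assumption: larger than 0.0001 for a short demo
mse = []

for _ in range(2000):
    Y_pred = X_train @ weight + bias       # shape (40, 1)
    error = Y_pred - Y_train               # shape (40, 1), no accidental broadcasting
    mse.append(np.mean(error ** 2))

    dw = X_train.T @ error / n_samples     # shape (6, 1), same as weight
    db = np.mean(error)

    weight -= lr * dw
    bias -= lr * db

print(weight.shape, mse[0], mse[-1])
```

An alternative that should work equally well is to keep `weight` one-dimensional and flatten the targets instead, for example with `Y_train = Y_train.ravel()`, so that every array in the loop is 1-D.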

Explanation: Initializing the weights as you did creates a 1-D array of shape (n_features,). np.dot(X_train, weight) then returns Y_pred with shape (n_samples,) instead of (n_samples, 1). When Y_train of shape (n_samples, 1) is subtracted, broadcasting treats Y_pred as (1, n_samples), so Y_pred - Y_train has shape (n_samples, n_samples) instead of (n_samples, 1). As a result, dw has shape (n_features, n_samples), here (6, 40), which cannot be written in place into weight of shape (6,). That is the error raised at weight -= lr * dw.
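The shapes can be checked with a short sketch. The data here is random and hypothetical; only the shapes (40, 6) and (40, 1) are taken from the comments under the question.

```python
import numpy as np

# Hypothetical random data with the shapes reported in the comments.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 6))
Y_train = rng.normal(size=(40, 1))
n_samples, n_features = X_train.shape

weight = np.zeros(n_features)                        # shape (6,)
bias = 0

Y_pred = np.dot(X_train, weight) + bias              # shape (40,)
residual = Y_pred - Y_train                          # (40,) with (40, 1) gives (40, 40)
dw = (1 / n_samples) * np.dot(X_train.T, residual)   # shape (6, 40)

print(Y_pred.shape, residual.shape, dw.shape)

try:
    weight -= 0.0001 * dw                            # a (6, 40) result does not fit into (6,)
    update_failed = False
except ValueError as err:
    update_failed = True
    print(err)
```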

NumPy broadcasting compares the two shapes dimension by dimension, starting from the trailing one (a shape with fewer dimensions is padded with 1s on the left). Two dimensions are compatible if they are equal or if one of them is 1; a dimension of size 1 is stretched to match the other array. In your example, during the subtraction, Y_pred is repeated along the rows and Y_train is repeated along the columns:

$\huge y_p - y_t = \begin{bmatrix} \rule[.5ex]{2.5ex}{0.5pt} & y_p & \rule[.5ex]{2.5ex}{0.5pt} \\ \rule[.5ex]{2.5ex}{0.5pt} & y_p & \rule[.5ex]{2.5ex}{0.5pt} \\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt} & y_p & \rule[.5ex]{2.5ex}{0.5pt} \\ \end{bmatrix} - \begin{bmatrix} \rule[-1ex]{0.5pt}{2.5ex} & \rule[-1ex]{0.5pt}{2.5ex} & & \rule[-1ex]{0.5pt}{2.5ex} \\ y_t & y_t & \cdots & y_t \\ \rule[-1ex]{0.5pt}{2.5ex} & \rule[-1ex]{0.5pt}{2.5ex} & & \rule[-1ex]{0.5pt}{2.5ex} \\ \end{bmatrix}$
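The same effect can be seen on a tiny example. The arrays `row` and `col` below are hypothetical stand-ins for Y_pred (as broadcasting sees it) and Y_train, with 3 samples instead of 40.

```python
import numpy as np

row = np.array([[10, 20, 30]])      # shape (1, 3), plays the role of Y_pred
col = np.array([[1], [2], [3]])     # shape (3, 1), plays the role of Y_train

# Both size-1 axes are stretched, so the result is a full 3x3 matrix.
diff = row - col
print(diff.shape)   # (3, 3)
print(diff)
# [[ 9 19 29]
#  [ 8 18 28]
#  [ 7 17 27]]

# With matching 1-D shapes the subtraction stays element-wise.
flat_diff = row.ravel() - col.ravel()
print(flat_diff)    # [ 9 18 27]
```

Only the diagonal of `diff` holds the per-sample differences that the loss and gradients are meant to use; the off-diagonal entries pair each prediction with the wrong target.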
