
My intention was to solve the L1-error fitting problem with scipy.optimize, as suggested in "L1 norm instead of L2 norm for cost function in regression model", but I kept getting the wrong solution. So I debugged with ordinary least squares, for which we know the closed-form solution:

import numpy as np
from scipy.optimize import minimize

n = 10
d = 2
A = np.random.rand(n, d)
x = np.random.rand(d, 1)
b = np.dot(A, x)          # noise-free targets, so an exact fit exists
x_hat = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(np.dot(A, x_hat) - b))


def fit(X, params):
    return X.dot(params)

def cost_function(params, X, y):
    return np.linalg.norm(y - fit(X, params))

x0 = np.zeros((d,1))

output = minimize(cost_function, x0, args=(A, b))

y_hat = fit(A, output.x)
print(np.linalg.norm(y_hat-b))
print(output)

The output is:

4.726604209672303e-16
2.2714597315189407
      fun: 2.2714597315189407
 hess_inv: array([[ 0.19434496, -0.1424377 ],
       [-0.1424377 ,  0.16718703]])
      jac: array([ 3.57627869e-07, -2.98023224e-08])
  message: 'Optimization terminated successfully.'
     nfev: 32
      nit: 4
     njev: 8
   status: 0
  success: True
        x: array([0.372247  , 0.32633966])

which looks super weird, because it can't even solve the plain L2 regression. Am I making a stupid mistake here?
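(For reference, the closed-form least-squares solution mentioned above can be checked directly against np.linalg.lstsq. A minimal sketch, regenerating its own A and b rather than reusing the exact arrays from the snippet:

import numpy as np

n, d = 10, 2
A = np.random.rand(n, d)
b = A @ np.random.rand(d, 1)

# Normal equations: x_hat = (A^T A)^{-1} A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x_normal, x_lstsq))      # expected: True
print(np.linalg.norm(A @ x_normal - b))    # ~0, since b lies in the range of A
)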

1 Answer


The error is due to mismatched shapes. minimize flattens x0, so params and all the iterates have shape (d,). Hence fit(X, params) has shape (n,), while your y has shape (n, 1). The expression y - fit(X, params) therefore has shape (n, n) because of NumPy's implicit broadcasting, and as a consequence np.linalg.norm returns the Frobenius matrix norm instead of the Euclidean vector norm.
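A minimal sketch of the broadcasting effect (shapes only, not your actual data):

import numpy as np

y = np.zeros((5, 1))       # column vector, shape (5, 1)
pred = np.zeros(5)         # 1-D array, shape (5,)
print((y - pred).shape)    # (5, 5) -- broadcast to a matrix, not a vector

# np.linalg.norm on a 2-D array defaults to the Frobenius norm,
# which is why the reported cost is not the residual you expect.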

Solution: change your cost function to:

def cost_function(params, X, y):
    return np.linalg.norm(y - fit(X, params.reshape(d,1)))
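
For your original goal (L1 fitting), the same shape fix applies. A minimal sketch, assuming you want to minimize the sum of absolute residuals with the same fit, A, b and d as above; since the L1 objective is non-smooth, a derivative-free method such as Nelder-Mead may behave better than the default BFGS:

def l1_cost(params, X, y):
    # sum of absolute residuals; reshape keeps y - prediction a column vector
    return np.sum(np.abs(y - fit(X, params.reshape(d, 1))))

l1_result = minimize(l1_cost, np.zeros(d), args=(A, b), method='Nelder-Mead')
print(l1_result.x)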