I'm trying to implement a simple logistic regression example in Clojure using the Incanter data analysis library. I've successfully coded the Sigmoid and Cost functions, but Incanter's BFGS minimization function seems to be causing me quite some trouble.
(ns ml-clj.logistic
  (:require [incanter.core :refer :all]
            [incanter.optimize :refer :all]))

(defn sigmoid
  "compute the inverse logit function, large positive numbers should be
  close to 1, large negative numbers near 0,
  z can be a scalar, vector or matrix.
  sanity check: (sigmoid 0) should always evaluate to 0.5"
  [z]
  (div 1 (plus 1 (exp (minus z)))))

(defn cost-func
  "computes the cost function (J) that will be minimized
  inputs:params theta X matrix and Y vector"
  [X y]
  (let
    [m (nrow X)
     init-vals (matrix (take (ncol X) (repeat 0)))
     z (mmult X init-vals)
     h (sigmoid z)
     f-half (mult (matrix (map - y)) (log (sigmoid (mmult X init-vals))))
     s-half (mult (minus 1 y) (log (minus 1 (sigmoid (mmult X init-vals)))))
     sub-tmp (minus f-half s-half)
     J (mmult (/ 1 m) (reduce + sub-tmp))]
    J))
When I try (minimize (cost-func X y) (matrix [0 0])), giving minimize a function and starting params, the REPL throws an error:
ArityException Wrong number of args (2) passed to: optimize$minimize clojure.lang.AFn.throwArity (AFn.java:437)
I'm very confused as to what exactly the minimize function is expecting.
For reference, I rewrote it all in Python, and all of the code runs as expected using the same minimization algorithm.
import numpy as np
import scipy as sp

data = np.loadtxt('testSet.txt', delimiter='\t')
X = data[:,0:2]
y = data[:, 2]

def sigmoid(X):
    return 1.0 / (1.0 + np.e**(-1.0 * X))

def compute_cost(theta, X, y):
    m = y.shape[0]
    h = sigmoid(X.dot(theta.T))
    J = y.T.dot(np.log(h)) + (1.0 - y.T).dot(np.log(1.0 - h))
    cost = (-1.0 / m) * J.sum()
    return cost

def fit_logistic(X,y):
    initial_thetas = np.zeros((len(X[0]), 1))
    myargs = (X, y)
    theta = sp.optimize.fmin_bfgs(compute_cost, x0=initial_thetas,
                                  args=myargs)
    return theta
This outputs:
Current function value: 0.594902
Iterations: 6
Function evaluations: 36
Gradient evaluations: 9
array([ 0.08108673, -0.12334958])
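One difference between the two versions is worth spelling out. In the Python call, fmin_bfgs receives compute_cost itself, a callable whose first argument is theta, and X and y travel separately through args. An expression such as (cost-func X y), by contrast, is a call and evaluates to a number before the optimizer ever sees it. The following is a minimal sketch of that "function of theta with the data fixed" shape in plain Python (standard library only, no NumPy). The names make_objective, X_demo and y_demo are hypothetical, the data is made up rather than taken from testSet.txt, and nothing here models Incanter's minimize.

```python
import math

def sigmoid(z):
    """Inverse logit for a scalar z."""
    return 1.0 / (1.0 + math.exp(-z))

def compute_cost(theta, X, y):
    """Average logistic-regression cost J(theta) over the rows of X."""
    m = len(y)
    total = 0.0
    for row, label in zip(X, y):
        h = sigmoid(sum(t * x for t, x in zip(theta, row)))
        total += label * math.log(h) + (1.0 - label) * math.log(1.0 - h)
    return -total / m

def make_objective(X, y):
    """Hypothetical helper: fix X and y, return a function of theta only."""
    def objective(theta):
        return compute_cost(theta, X, y)
    return objective

# Made-up data for illustration only (not the contents of testSet.txt).
X_demo = [[1.0, 2.0], [2.0, -1.0], [-1.5, 0.5], [0.5, 1.5]]
y_demo = [1.0, 0.0, 0.0, 1.0]

objective = make_objective(X_demo, y_demo)  # a callable, not a number
cost_at_zero = objective([0.0, 0.0])        # a number, roughly ln 2

print(callable(objective), cost_at_zero)
```

With theta at all zeros every prediction is 0.5, so the cost is ln 2 (about 0.693) whatever the labels are, which makes a handy sanity check for any implementation of the cost function.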
I don't understand why the Python code can run successfully, but my Clojure implementation fails. Any suggestions?
Update
After rereading the docstring for minimize, I've been trying to calculate the derivative of cost-func, which throws a new error:
(def grad (gradient cost-func (matrix [0 0])))
(minimize cost-func (matrix [0 0]) (grad (matrix [0 0]) X))
ExceptionInfo throw+: {:exception "Matrices of different sizes cannot be differenced.", :asize [2 1], :bsize [1 2]} clatrix.core/- (core.clj:950)
Using trans to convert the 1xn row matrix to an nx1 column matrix just yields the same error with the sizes swapped:
:asize [1 2], :bsize [2 1]}
I'm pretty lost here.
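For context, the derivative involved is the gradient of J with respect to theta. It has one entry per parameter, so it should come out with the same shape as theta itself. Below is a minimal sketch in plain Python (standard library only), assuming the standard logistic-regression gradient (1/m) * X^T (h - y) and checking it against central finite differences. The names analytic_gradient and numeric_gradient and the demo data are hypothetical, and the sketch does not use or model Incanter's gradient function.

```python
import math

def sigmoid(z):
    """Inverse logit for a scalar z."""
    return 1.0 / (1.0 + math.exp(-z))

def compute_cost(theta, X, y):
    """Average logistic-regression cost J(theta) over the rows of X."""
    m = len(y)
    total = 0.0
    for row, label in zip(X, y):
        h = sigmoid(sum(t * x for t, x in zip(theta, row)))
        total += label * math.log(h) + (1.0 - label) * math.log(1.0 - h)
    return -total / m

def analytic_gradient(theta, X, y):
    """Gradient of J: (1/m) * X^T (h - y), one entry per parameter."""
    m = len(y)
    grad = [0.0] * len(theta)
    for row, label in zip(X, y):
        err = sigmoid(sum(t * x for t, x in zip(theta, row))) - label
        for j, x in enumerate(row):
            grad[j] += err * x / m
    return grad

def numeric_gradient(f, theta, eps=1e-6):
    """Central-difference approximation of the gradient of f at theta."""
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[j] += eps
        down[j] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad

# Made-up data for illustration only.
X_demo = [[1.0, 2.0], [2.0, -1.0], [-1.5, 0.5], [0.5, 1.5]]
y_demo = [1.0, 0.0, 0.0, 1.0]
theta0 = [0.0, 0.0]

analytic = analytic_gradient(theta0, X_demo, y_demo)
numeric = numeric_gradient(lambda t: compute_cost(t, X_demo, y_demo), theta0)

print(analytic)
print(numeric)
```

Both results are flat lists with the same length as theta0, and they should agree to several decimal places. If a gradient and a parameter vector end up as a row on one side and a column on the other, an elementwise subtraction between them cannot line up, which is the kind of mismatch the error message describes.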