
I am trying to get consistent answers for a simple optimization problem between two functions in MATLAB and Octave. Here is my code:

  options = optimset('MaxIter', 500, 'Display', 'iter', 'MaxFunEvals', 1000);

  objFunc = @(t) lrCostFunction(t, X, y);

  [result1] = fminsearch(objFunc, theta, options);
  [result2] = fmincg(objFunc, theta, options);

(Bear in mind that X, y, and theta are defined earlier and are correct.) The problem is the following: when I run the above code in MATLAB using fmincg (commenting out fminsearch), I get the correct answer.

However, if I comment out fmincg and run fminsearch instead, I get no convergence whatsoever. In fact, the output looks like this:

   491          893         0.692991         reflect
   492          894         0.692991         reflect
   493          895         0.692991         reflect
   494          896         0.692991         reflect
   495          897         0.692991         reflect
   496          898         0.692991         reflect
   497          899         0.692991         reflect
   498          900         0.692991         reflect
   499          901         0.692991         reflect
   500          902         0.692991         reflect



Exiting: Maximum number of iterations has been exceeded
         - increase MaxIter option.
         Current function value: 0.692991 

Increasing the number of iterations doesn't help at all. In contrast, when using fmincg, I see it converging, and it finally gives me the correct result:

Iteration     1 | Cost: 2.802128e-001
Iteration     2 | Cost: 9.454389e-002
Iteration     3 | Cost: 5.704641e-002
Iteration     4 | Cost: 4.688190e-002
Iteration     5 | Cost: 3.759021e-002
Iteration     6 | Cost: 3.522008e-002
Iteration     7 | Cost: 3.234531e-002
Iteration     8 | Cost: 3.145034e-002
Iteration     9 | Cost: 3.008919e-002
Iteration    10 | Cost: 2.994639e-002
Iteration    11 | Cost: 2.678528e-002
Iteration    12 | Cost: 2.660323e-002
Iteration    13 | Cost: 2.493301e-002

.
.
.


Iteration   493 | Cost: 1.311466e-002
Iteration   494 | Cost: 1.311466e-002
Iteration   495 | Cost: 1.311466e-002
Iteration   496 | Cost: 1.311466e-002
Iteration   497 | Cost: 1.311466e-002
Iteration   498 | Cost: 1.311466e-002
Iteration   499 | Cost: 1.311466e-002
Iteration   500 | Cost: 1.311466e-002

This gives the correct answer.

So what gives? Why is fminsearch not working in this minimization case?

Additional context:

1) Octave is the language that has fmincg, by the way; however, a quick Google search also retrieves this function, and my MATLAB can call either.

2) My problem has a convex error surface that is everywhere differentiable.

3) I only have access to fminsearch and fminbnd (which I can't use, since this problem is multivariate, not univariate), so that leaves fminsearch. Thanks!

Spacey
    If I'm not mistaken, this is related to the online ML class from Stanford. If I remember correctly, `fmincg` is not a function shipped with MATLAB but was provided as part of the homework. Thus you can't expect them to implement the exact same algorithm. Check this [page](http://www.mathworks.com/help/techdoc/math/bsotu2d.html#bsgpq6p-11) to learn about the FMINSEARCH algorithm. – Amro May 27 '12 at 01:17
  • For what its worth, `fminsearch` is implemented in [octave-forge](http://octave.sourceforge.net/optim/function/fminsearch.html) – Amro May 27 '12 at 01:20
  • @Amro Yes, it's from the class. I don't expect them to be the same; however, the main problem is that fminsearch is simply not working at all, and I am trying to figure out why. The problem is convex and everywhere differentiable, so I do not know why it would not work at all... what I am trying to do is not have to use/depend on Octave's functions, and use my native MATLAB functions like fminsearch. – Spacey May 27 '12 at 01:23
  • Ok I just looked it up, and there was a comment in the class that "fmincg works similarly to fminunc, but is more efficient when we are dealing with large number of parameters". Also as @eakbas explained, while `fminsearch` is indeed used to solve non-linear unconstrained optimizations, it uses a "no derivatives" method (Simplex algorithm) which is different from what `fminunc` does (it needs gradient of the function, or it computes an approximation using finite-differences) – Amro May 27 '12 at 01:32
  • @Amro - yes, I am aware that fminsearch is not computing/using gradients while fmincg/fminunc is. But then the question becomes: why doesn't fminsearch work? ...There is nothing in the problem formulation that says a gradient must be used... if the problem is convex and everywhere differentiable, then there should be no problem. So why doesn't fminsearch work? – Spacey May 27 '12 at 01:37
  • @Learnaholic don't show the start of the algorithm, show the end. Does it terminate? What does it say? Also, what is the size of your input to be minimized? – Rasman May 27 '12 at 02:00
  • Also, have you tried changing your tolerance values? – Rasman May 27 '12 at 02:07
  • @Rasman The tolerance values matter for stopping the convergence - but over here we have no convergence to speak of... (also please see edits). – Spacey May 27 '12 at 02:19
    @Learnaholic: I don't claim to be an expert on the subject, but according to [Wikipedia](http://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method) the Nelder–Mead simplex method (used by `fminsearch`) can converge to non-stationary points on some problems that can be solved by alternative methods. Apparently the problem must satisfy stronger conditions than are necessary in other methods. I'm not really motivated to dig any deeper, perhaps you can learn more from this [paper](http://dx.doi.org/10.1137%2FS1052623496303482).. good luck to you :) – Amro May 27 '12 at 02:20
  • @Amro Ah! That is still good to know thanks! - I half suspected that there must be some 'stronger' condition than just being convex. (FFUUUUU). Sigh. I suppose that must be it. :-/ – Spacey May 27 '12 at 02:23

2 Answers


I assume that fmincg implements a conjugate-gradient-type optimization, while fminsearch is a derivative-free optimization method. So why would you expect them to give the same results? They are completely different algorithms.
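Since the class's lrCostFunction also returns a gradient (that is what fmincg consumes), a gradient-based built-in like fminunc is the closer MATLAB analogue. A minimal sketch, assuming lrCostFunction(t, X, y) returns [cost, gradient]:

```matlab
% Sketch: use fminunc (gradient-based) as a stand-in for fmincg.
% Assumes lrCostFunction returns [J, grad]; 'GradObj' tells fminunc
% to use the supplied gradient instead of finite differences.
options = optimset('GradObj', 'on', 'MaxIter', 500, 'Display', 'iter');
objFunc  = @(t) lrCostFunction(t, X, y);
[result3, cost3] = fminunc(objFunc, theta, options);
```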

I would expect fminsearch to find the global minimum of a convex cost function. At least, that has been my experience so far.

The first line of fminsearch's output suggests that objFunc(theta) is ~0.69, but this value is very different from the cost values in fmincg's output. So I would look for possible bugs outside fminsearch. Make sure you are giving the same cost function and initial point to both algorithms.
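The sanity check above can be sketched as follows (result1 and result2 are the fminsearch and fmincg outputs from the question):

```matlab
% Sanity checks: the cost at the shared starting point should match the
% first cost both optimizers report, and the final costs can be compared.
fprintf('cost at theta:   %g\n', objFunc(theta));
fprintf('cost at result1: %g\n', objFunc(result1));   % fminsearch
fprintf('cost at result2: %g\n', objFunc(result2));   % fmincg
```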

emrea
  • The problem is that fminsearch doesn't even begin to converge at all, ever. The method should not matter in this case. The problem is convex, as fminsearch expects. So why wouldn't it work? – Spacey May 27 '12 at 01:13
  • What do you mean by "begin to converge". Have you run it until convergence? Did you try changing its parameters like maximum number of function evaluations, tolerances? I would expect fminsearch to find the global minima for a convex cost function. At least, this is my experience so far. – emrea May 27 '12 at 01:16
  • I have changed number of iterations, number of max function evals, and no difference. It does not budge from that figure of 0.692991. I have also edited with some more info. Yes its convex and everywhere differentiable. – Spacey May 27 '12 at 01:21
  • So, you ran fminsearch until convergence. Does it give you a wrong result? Does it return the initial point as the result? What happens if you change the initial point? – emrea May 27 '12 at 01:25
  • 1) I don't know about it running to convergence. It is not converging to anything. 2) If the initial point is all zeros, it returns all 1.9e-4. If I make the initial point anything else, yes, it will return that initial point. 3) Yes, it will return the initial point as the result. – Spacey May 27 '12 at 01:32
  • And, 1.9e-4 is not the correct solution? I cannot think of any reasons on why fminsearch doesn't give you the correct solution for a convex function. I would look for a bug outside of fminsearch, e.g. how you are calling it, is the cost function setup correctly, ... – emrea May 27 '12 at 01:38
  • Also, do some sanity checking. Look at the actual value of the cost by calling: objFunction(theta); objFunction(result1); objFunction(result2); – emrea May 27 '12 at 01:39
  • eakbas, the data is correct, otherwise fmincg would not give me the correct answer. fmincg gives the correct answer; fminsearch does not. The most obvious problem is that fminsearch doesn't even *begin* to budge, as you can see from the output. That must give clues as to the underlying problem. Seeing as how this is my first time using fminsearch, I am open to suggestions, but the data is definitely correct. – Spacey May 27 '12 at 01:43
  • I assume that 0.692991 in fminsearch's output is the current value of the cost. Why are they so different than the cost values in fmincg's output? How do these values compare to 'objFunction(theta)'? – emrea May 27 '12 at 01:48
  • That's a good question, and I could not answer that initially even. This makes me think I am not using fminsearch correctly. Yes, the correct answer is 0.0137, which fmincg finds, no problem, whereas fminsearch gives that 0.692991. I have no clue how or why they could give different results. Yes, if I manually do objFunc(result2) I get the correct 0.0137. If I manually do objFunc(result1) I get the wrong 0.69 – Spacey May 27 '12 at 01:59
  • Looking at the first line of fminsearch's output, objFunc(theta) should be 0.693147. If you are giving the same cost function and starting point to fmincg, then I would expect to see the same cost value (0.693147) as the cost at iteration 1 in fmincg's output, yet it's so different. – emrea May 27 '12 at 02:11
  • eakbas Yes, that is indeed very strange. In fact, objFunc(theta_initial) is equal to 0.69. I do not know why fmincg is giving a different initial cost result - but the final correct result! O_o – Spacey May 27 '12 at 02:16
  • I revisited this, and I just wanted to clear up this minor detail. If you look at the source code of `fmincg` you will see that it starts printing the function value beginning at iteration 1 (ie: it does not print it at iteration 0 (initial cost), which was evaluated at line 78). So that was not the real issue, but I've already hinted at what I suspect is the problem in the comments above. To double check, I added a print statement immediately after line 78, and this is the [output](http://pastebin.com/UzsPfVjz) I get. As you can see the initial value is correctly at `6.931472e-01` – Amro May 27 '12 at 21:19

This is a problem I've noticed sometimes with this algorithm. It may not be the answer you are looking for, but what seems to work for me in these cases is to modify the tolerance values at which it terminates. What I see is an oscillation between two points providing equal results. I know this happens in LabVIEW, and can only speculate that it happens in MATLAB.

Unless I see your data, I can't comment more, but that is what I suggest.

Note: by increasing the tolerance, the goal is to catch the algorithm before it reaches that state. It becomes less precise, but usually the number of significant digits is rather small anyway.
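A minimal sketch of what I mean, using optimset's TolX and TolFun fields (the specific tolerance values here are placeholders to tune, not recommendations):

```matlab
% Sketch: loosen fminsearch's termination tolerances so it stops once
% the simplex oscillates between near-equal points, rather than burning
% through MaxIter. TolX bounds the step size, TolFun the cost change.
options = optimset('MaxIter', 500, 'MaxFunEvals', 1000, ...
                   'TolX', 1e-3, 'TolFun', 1e-3, 'Display', 'iter');
result = fminsearch(objFunc, theta, options);
```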

Rasman
  • What I am wondering, Rasman, is: if the algorithm is not even beginning to move from the get-go, why/how would the tolerances matter? – Spacey May 27 '12 at 02:43
  • @Learnaholic on the contrary, the Func-count jumped to 400 almost immediately – Rasman May 27 '12 at 02:47
  • @Learnaholic, which, btw, made me wonder about the size of t. The algorithm doesn't respond too well to giant inputs. – Rasman May 27 '12 at 02:57
  • Rasman, I am not sure I understand what you mean about the tolerance still... to my mind, does the tolerance only matter after you begin to converge? BTW, the size of t is 400. – Spacey May 27 '12 at 03:00
  • ok, so my solution *may* be no good. FTR, fminsearch will struggle to converge on this (although it still may). The goal of changing the tolerance is that you assume you are deep enough down the rabbit hole that the small reflection that is causing your problem isn't worth effort. In your case however, you may still have a long way to go – Rasman May 27 '12 at 03:09
  • btw, you should still get a result if you hit your maxiter. How does it compare to the other routines? – Rasman May 27 '12 at 03:21