1

I am optimizing a rather messy likelihood function in MATLAB, where I need to run about 1,000 separate runs of the optimization algorithm (fmincon) from different initial points, with something like 32 free parameters.

Unfortunately, I cannot parallelize both the 1,000 runs of the optimization algorithm and the computation of the finite-difference gradient at the same time; I must choose one.

Does anyone know whether it's more efficient to parallelize the outer loop, so each optimization runs on its own core, or to parallelize the finite-difference gradient computation within each run?

Thanks!

  • 1
    I don't know if it's applicable to your situation, but check if you can use a `gpuarray`. It only works if you have an Nvidia GPU with CUDA cores as far as I know, but it could give you 1,000 cores or more to work with. I'm not guaranteeing that it'd make your specific program faster, but it could be worth a try. – maxb Jan 21 '17 at 17:45
  • Thanks! I'll check it out. Actually in the market for a new video card now. I'll keep this in mind. – hipHopMetropolisHastings Jan 22 '17 at 19:17

1 Answer

2

This is impossible to answer exactly without knowing more about your code and hardware.

A forward-difference gradient over 32 parameters requires only about 32 extra function evaluations per iteration, so if you have more than 32 cores, some of them will have nothing to do during parallel gradient computation. In that case, running the 1,000 optimization runs in parallel is likely faster.

On the other hand, computing the gradients in parallel might enable your CPU(s) to use their caches more efficiently, in that there will be fewer cache misses. You may have a look at Why does the order of the loops affect performance when iterating over a 2D array? or What is “cache-friendly” code?.
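In case it helps to see the two configurations side by side, here is a minimal sketch of how you would set each one up, assuming the Parallel Computing Toolbox is available; `negLogLik`, `x0`, `lb`, `ub`, and `nStarts` are placeholder names for your objective, starting points, bounds, and number of runs:

```matlab
results = cell(nStarts, 1);

% Option A: parallelize the outer loop over starting points.
% Each fmincon run (including its finite-difference gradient)
% stays serial on a single worker.
optsA = optimoptions('fmincon', 'UseParallel', false);
parfor i = 1:nStarts
    results{i} = fmincon(@negLogLik, x0(i,:), ...
        [], [], [], [], lb, ub, [], optsA);
end

% Option B: run the starts serially, but let fmincon evaluate
% the finite-difference gradient in parallel across workers.
optsB = optimoptions('fmincon', 'UseParallel', true);
for i = 1:nStarts
    results{i} = fmincon(@negLogLik, x0(i,:), ...
        [], [], [], [], lb, ub, [], optsB);
end
```

Timing a handful of starts under each configuration (e.g. with `tic`/`toc`) on your actual hardware should tell you quickly which one wins for your problem.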

Lumen
  • Thank you! I have 25 cores to work with. I know I ultimately need to do some tests but thought someone more knowledgeable than I might already know. I'll check out those links – hipHopMetropolisHastings Jan 20 '17 at 20:17