6

I am about to write an algorthim for real-time applications, which involves some high dimensional NLPs (nonlinear programmings).

before implentations, I need to timing my algorithim to see whether it is feasible for real-time applications, therefore I use Matlab's built-in fmincons as a baseline.

As experience shows, matlab algorithms tend to vary from slower to magitudes slower than their C++ counterparts, so I want to estimate what kind of performance gain I can expect with this particular case?

As my work is mostly related to real-time applications, thus I rarely use NLP(nonlinear programming), so I asked my workmates, they recommend me to try ipopt as a start, I googled its website, there is no benchmarks there against Matlab, nor there is much topics regarding the details of their algorithms (at least in Matlab, it is not hard to check the details of their algorthims), so I basically have little idea about the accuracy/robustness/optimality etc. about it.

So any help here regarding NLP's C++ implenmentations will be very helpful, many thanks in advance.

shx2
  • 61,779
  • 13
  • 130
  • 153
user0002128
  • 2,785
  • 2
  • 23
  • 40
  • 1
    Search problems are very domain-dependant; the only definitive answer you're going to get is by actually having both systems solve a real problem that you might encounter, and seeing how they perform. – Isaac Nov 10 '12 at 03:09

1 Answers1

2

Many of these sort of problems are dominated by large O(n^~3) matrix multiplications. If that is the case and both systems are using the same algorithm than the performance will be similar and won't depend on the language, as the underlying matrix multiply function will be implemented natively in asm anyway.

If the algorithm isn't dominated by a simple function like this, and instead has a lot of memory management required than the C++ library will win by a lot (3-10 x as fast).

(If performance is critical than many people are using OpenCL to farm stuff off to the GPU which is designing for this type of numerical calculation, and has price/performance differences in the 20-100x range. Or you can farm it off to a cluster if you need even faster.)

Andrew Tomazos
  • 66,139
  • 40
  • 186
  • 319
  • Agree with you on many parts, but I suspect GPU can deliver any improvements here, using SQP as an example, practically the method has to has lots of branching to work properly, considering GPU's extremely poor branching peformance and very low cache (which basically completely remove any advantage GPU has over CPU in memory bandwidth), I will be surprised if GPU's implentation will not be significantly slower than their CPU counterpart. – user0002128 Nov 11 '12 at 03:39
  • The GPUs advantage over CPU is its massively parallel architecture, which has effectively thousands of processor cores vs a CPUs 10 or so. Provided your problem can be subdivided into a data parallel structure (which not all can), the GPU will win by order(s) of magnitude. http://www.youtube.com/watch?v=IEWGTpsFtt8 – Andrew Tomazos Nov 11 '12 at 05:04