59

I am aware that C/C++ is a lower-level language that compiles to relatively optimized machine code compared with most high-level languages. But I suspect there is more to it than that, as practice also suggests.

When I do simple calculations, like Monte Carlo averaging of a collection of Gaussian samples, I see little difference between a C++ implementation and a MATLAB implementation; sometimes MATLAB is in fact a bit faster.

When I move on to larger-scale simulations with thousands of lines of code, the real picture slowly shows up: the C++ simulations are far faster, with run times on the order of 100x shorter than an equivalent MATLAB implementation.

The C++ code is, most of the time, pretty much serial, with no elaborate optimization done explicitly, whereas, as far as I know, MATLAB inherently does a lot of optimization. This shows up, for example, when I generate a huge chunk of random samples: the equivalent in C++ using a library like IT++/GSL/Boost runs relatively slower (the algorithm is the same, namely mt19937).
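For concreteness, here is a minimal sketch of the kind of Monte Carlo averaging experiment described above, written with the standard `<random>` header instead of IT++/GSL/Boost. The names `SampleStats` and `gaussian_stats` and the fixed seed are illustrative assumptions, not taken from the actual simulations.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>

// Summary statistics of a batch of pseudo-random Gaussian samples.
// (Hypothetical helper type for this sketch.)
struct SampleStats {
    double mean;
    double stddev;
};

// Draws n samples from N(mu, sigma^2) using std::mt19937 and returns their
// sample mean and (population) standard deviation. The seed is fixed so that
// repeated runs on the same platform are comparable.
SampleStats gaussian_stats(std::size_t n, double mu, double sigma,
                           unsigned seed = 12345u) {
    assert(n > 0 && sigma > 0.0);
    std::mt19937 gen(seed);
    std::normal_distribution<double> dist(mu, sigma);

    double sum = 0.0;
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double x = dist(gen);
        sum += x;
        sum_sq += x * x;
    }
    const double mean = sum / static_cast<double>(n);
    const double var = sum_sq / static_cast<double>(n) - mean * mean;
    // Guard against a tiny negative variance caused by rounding.
    return {mean, std::sqrt(var > 0.0 ? var : 0.0)};
}
```

Timing a call such as `gaussian_stats(10000000, 0.0, 1.0)` against the MATLAB equivalent would be one way to reproduce the comparison; the result will depend on the compiler, its optimization flags, and the standard library's `normal_distribution` implementation.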

My question is simply whether there is a simple performance tradeoff between MATLAB and C++. Is it just as people say, "Whenever you can, C/C++ is better" (which is also what I frequently experience)? Or, from a different perspective: "What is MATLAB good for, other than comfort?"

By the way, I don't consider coding efficiency significant here, assuming the same programmer in both cases. I also think alternatives like Python and R are not relevant here. But the dependence on the specific libraries we use should be interesting.

[I am a PhD student in coding theory for communication systems. I do simulations in MATLAB/C++ all the time, and have reasonable experience, having written a few tens of thousands of lines in each.]

Loves Probability
  • 2
    Well, from a performance perspective, Matlab is better when you know how to code it and don't know how to code C++; C++ is better the rest of the time. – RichardPlunkett Dec 11 '13 at 07:43
  • 3
    I have done matlab to C++ translations. A typical expectation on "normal" matlab code was for the C++ to be 20x faster. – RichardPlunkett Dec 11 '13 at 07:47
  • @Richard Yeah, I ignored this aspect just to avoid too many questions. This translation provides a good insight, I believe. But I tried primarily to focus on the 'why and when' of Matlab vs C++. – Loves Probability Dec 11 '13 at 07:51
  • @Richard The "when you know how to" part is, I think, uncommon for a common programmer, isn't it? I would appreciate it if someone could shed some light on this point, since it appears to be a popular opinion. – Loves Probability Dec 11 '13 at 07:54
  • 1
    I'd mention that MATLAB has a positive in that all its libraries use fairly robust implementations, so you don't have to worry as much about numerical stability and which algorithm to select. On the other hand, a C++ library could offer all the same luxuries... – user904963 Dec 11 '13 at 08:22
  • 2
    Many of the critical parts of MATLAB use some sort of native library (developed in-house or from a third party), implemented in a compiled language (C/C++, Fortran). For instance, the simple [backslash operator](http://stackoverflow.com/a/18553768/97160) `x = A\b` is actually a front for a dozen possible underlying implementations. For the other parts implemented in pure MATLAB, the JIT compiler helps alleviate the cost of an interpreted language. Also, MATLAB often encourages writing vectorized code (think SIMD instructions). Finally, the GUI stuff is largely implemented in Java. – Amro Dec 11 '13 at 14:16

9 Answers

129

I have been using Matlab and C++ for about 10 years. For every numerical algorithm I implement for my research, I start by prototyping in Matlab and then translate the project to C++ to gain a 10x to 100x (I am not kidding) performance improvement. Of course, I am comparing optimized C++ code to fully vectorized Matlab code. On average, the improvement is about 50x.

There are a lot of subtleties behind both programming languages, and the following are some common misunderstandings:

  1. Matlab is a script language but C++ is compiled

    Matlab uses a JIT compiler to translate your script to machine code. You can improve your speed by at most a factor of 1.5 to 2 by using the compiler that Matlab provides.

  2. Matlab code might be able to get fully vectorized but you have to optimize your code by hand in C++

    Fully vectorized Matlab code can call libraries written in C++/C/Assembly (for example Intel MKL). But plain C++ code can be reasonably vectorized by modern compilers.

  3. Toolboxes and routines that Matlab provides should be very well tuned and should have reasonable performance

    No. Other than linear algebra routines, the performance is generally bad.
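To illustrate the remark in point 2 that plain C++ can be vectorized by the compiler, here is a minimal sketch of a loop with the shape auto-vectorizers usually handle well. The function name `axpy` is an illustrative choice, and whether SIMD instructions are actually emitted is an assumption that depends on the compiler, its optimization level (for example `-O3` with GCC or Clang), and the target CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y[i] += a * x[i] for all i. The loop body has no branches and no
// dependencies between iterations, which is the shape compilers can
// typically turn into SIMD instructions when optimization is enabled.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const double* xp = x.data();
    double* yp = y.data();
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] += a * xp[i];
    }
}
```

No intrinsics or assembly appear in the source; inspecting the generated assembly or the compiler's vectorization report is the way to check what was actually produced.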

The reasons why you can gain 10x~100x performance in C++ compared to vectorized Matlab code:

  1. Calling external libraries (MKL) in Matlab costs time.

  2. Memory in Matlab is dynamically allocated and freed. For example, a multiplication of small matrices such as
    A = B*C + D*E + F*G
    requires Matlab to create 2 temporary matrices. In C++, if you allocate your memory beforehand, you create none. Now imagine running that statement in a loop 1000 times. Another solution in C++ is provided by C++11 rvalue references, one of the biggest improvements in C++: C++ code can now be as fast as plain C code.

  3. If you want to do parallel processing, Matlab's model is multi-process and the C++ way is multi-threaded. If you have many small tasks that need to be parallelized, C++ provides a linear gain up to many threads, whereas in Matlab you might even lose performance.

  4. Vectorization in C++ involves using intrinsics/assembly, and sometimes SIMD vectorization is only possible in C++.

  5. In C++, it is possible for an experienced programmer to completely avoid L2 cache misses and even L1 cache misses, hence pushing the CPU to its theoretical throughput limit. Matlab performance can lag behind C++ by a factor of 10x for this reason alone.

  6. In C++, computationally intensive instructions can sometimes be grouped according to their latencies (by coding carefully in assembly or intrinsics) and dependencies (most of the time done automatically by the compiler or the CPU hardware), such that the theoretical IPC (instructions per clock cycle) can be reached and the CPU pipelines are filled.
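Point 2 in the list above (no temporaries when memory is allocated beforehand) can be sketched as follows. This is a naive illustration under stated assumptions: the `Matrix` type and the functions `multiply_accumulate` and `sum_of_products` are hypothetical names invented for this example, the matrices are square, and the output must not alias any input. A tuned BLAS would be used for the products in real code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Dense n-by-n matrix stored row-major in one contiguous buffer.
struct Matrix {
    std::size_t n;
    std::vector<double> data;
    explicit Matrix(std::size_t n_) : n(n_), data(n_ * n_, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return data[i * n + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * n + j]; }
};

// out += lhs * rhs, written directly into out: no temporary matrix is created.
// out must be a different object from lhs and rhs.
void multiply_accumulate(const Matrix& lhs, const Matrix& rhs, Matrix& out) {
    assert(lhs.n == rhs.n && rhs.n == out.n);
    const std::size_t n = out.n;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const double l = lhs(i, k);
            for (std::size_t j = 0; j < n; ++j) {
                out(i, j) += l * rhs(k, j);
            }
        }
    }
}

// A = B*C + D*E + F*G, reusing the caller's preallocated A.
void sum_of_products(const Matrix& B, const Matrix& C, const Matrix& D,
                     const Matrix& E, const Matrix& F, const Matrix& G,
                     Matrix& A) {
    std::fill(A.data.begin(), A.data.end(), 0.0);  // reset, do not reallocate
    multiply_accumulate(B, C, A);
    multiply_accumulate(D, E, A);
    multiply_accumulate(F, G, A);
}
```

When `sum_of_products` is called in a loop, the same `A` buffer is reused on every iteration, so the loop body performs no heap allocation at all.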

However, development time in C++ is also about 10x that of Matlab!

The reasons why you should use Matlab instead of C++:

  1. Data visualization. I think my career can go on without C++ but I won't be able to survive without Matlab just because it can generate beautiful plots!

  2. Low-efficiency but mathematically robust built-in routines and toolboxes. Get the correct answer first and then talk about efficiency. People can make subtle mistakes in C++ (for example, implicitly converting double to int) and get sort-of-correct results.

  3. Expressing your ideas and presenting your code to your colleagues. Matlab code is much easier to read and much shorter than C++, and Matlab code can be executed correctly without a compiler. I just refuse to read other people's C++ code. I don't even use the C++ GNU scientific libraries because the code quality is not guaranteed. It is dangerous for a researcher/engineer to use a C++ library as a black box and take its accuracy for granted. Even for commercial C/C++ libraries, I remember the Intel compiler had a sign error in its sin() function last year, and numerical accuracy problems also occurred in MKL.

  4. Debugging a Matlab script with the interactive console and workspace is a lot more efficient than using a C++ debugger. Finding an index calculation bug in Matlab can be done within minutes, but in C++ it can take hours to figure out why the program crashes randomly if bounds checking has been removed for the sake of speed.
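The implicit double-to-int conversion mentioned in point 2 of the list above can be illustrated with a small sketch. The function names `sum_truncating` and `sum_exact` are hypothetical and exist only for this example.

```cpp
#include <vector>

// Buggy: the accumulator is an int, so each `sum += v` computes the sum in
// double and then silently truncates it back to int.
int sum_truncating(const std::vector<double>& values) {
    int sum = 0;
    for (double v : values) {
        sum += v;  // implicit double -> int conversion on every iteration
    }
    return sum;
}

// Intended: accumulate in double, so nothing is truncated.
double sum_exact(const std::vector<double>& values) {
    double sum = 0.0;
    for (double v : values) {
        sum += v;
    }
    return sum;
}
```

For the input `{1.5, 2.5, 3.5}`, the first function returns 6 while the second returns 7.5: a plausible-looking number that is quietly wrong. Conversion warnings such as `-Wconversion` in GCC and Clang exist for this class of mistake, but they are not enabled by default.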

Last but not least:

Because there is not much left for a programmer to optimize once Matlab code is vectorized, Matlab performance is much less sensitive to code quality than C++ performance. It is therefore best to optimize computational algorithms in Matlab, where marginally better algorithms normally give marginally better performance. Testing algorithms in C++, on the other hand, requires a decent programmer who writes the algorithms optimized in more or less the same way and makes sure the compiler does not optimize them differently.

My recent experience in C++ and Matlab:

I made several large Matlab data analysis tools in the past year and suffered from the slow speed of Matlab. But I was able to improve my Matlab program speed by 10x through the following techniques:

  • Run/profile the Matlab script, re-implement the critical routines in C/C++ and compile them with MEX. Critical routines are most likely logically simple but numerically heavy. This improves speed by 5x.

  • Simplify the ".m" files shipped with Matlab toolboxes by commenting out all unnecessary safety checks and output parameter computations. Please be reminded that the modified code cannot be distributed with the rest of the user scripts. This improves speed by another 2x (after C/C++ and MEX).

The improved code is ~98% in Matlab and ~2% in C++.

I believe it is possible to improve the speed by another 2x (20x in total) if the entire tool is coded in C++, which amounts to a ~100x speed improvement of the computation routines. Hard drive I/O will then dominate the program's run time.

Question for Mathworks engineers:

When Matlab code is fully vectorized, one of the performance-limiting factors is the matrix indexing operation. For instance, suppose a finite difference operation needs to be performed on a matrix A of dimension 5000x5000:

B = A(:,2:end)-A(:,1:end-1)

The matrix indexing operation makes the Matlab code multiple times slower than the C++ code. Can the matrix indexing performance be improved?
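For comparison, a minimal C++ sketch of the same finite difference is shown below. It assumes MATLAB's column-major layout with the matrix held in a flat `std::vector<double>`; the function name `column_difference` is hypothetical. The sketch makes no claim about how MATLAB evaluates the indexing expression internally; it only shows that an explicit loop can read `A` in place, without first extracting the two sub-matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major matrix (MATLAB's layout): element (i, j) is data[i + j * rows].
// Computes B = A(:, 2:end) - A(:, 1:end-1) in a single pass, reading A in
// place and writing into B, which is resized to rows x (cols - 1).
void column_difference(const std::vector<double>& A, std::size_t rows,
                       std::size_t cols, std::vector<double>& B) {
    assert(cols >= 1 && A.size() == rows * cols);
    B.resize(rows * (cols - 1));
    for (std::size_t j = 0; j + 1 < cols; ++j) {
        const double* left = A.data() + j * rows;  // column j
        const double* right = left + rows;         // column j + 1
        double* out = B.data() + j * rows;
        for (std::size_t i = 0; i < rows; ++i) {
            out[i] = right[i] - left[i];
        }
    }
}
```

If `B` already has the right size from a previous call, `resize` does not reallocate, so repeated calls reuse the same buffer.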

PhD AP EcE
  • 10
    That's a beautiful summary. And it made me curious to find out who this amazing PhD guy from Cornell is, but "Apparently, this user prefers to keep an air of mystery about them". It has been over 2 years since I asked this question. That's no problem; privacy should be respected. Cheers! – Loves Probability Sep 08 '16 at 05:56
  • 2
    Thank you! I am very glad to help, and share my hard learned lessons in the past. – PhD AP EcE Sep 15 '16 at 14:31
  • 2
    Point 3.6: latest versions of Matlab do not need to copy to a 3rd matrix to swap 2 matrices due to JIT copy on write. The set of commands: c = a; a = b; b = c; clear c; will only generate references and no data copying occurs. – Wybird666 Aug 05 '18 at 23:21
  • @Wybird666: MATLAB has always had lazy copying (copy-on-write), this has nothing to do with the JIT. – Cris Luengo Mar 04 '19 at 18:43
  • Thank you Wybrid666, could you help me edit the post such that it reflects your/correct understanding? – PhD AP EcE Mar 04 '19 at 18:46
  • Regarding your new edit: yes, this is annoying. But this is not the right place to ask for changes to MATLAB, they're not going to see this. If you have a license, submit a request directly to MathWorks. You're more likely to influence their work that way. Good luck! (BTW: finite differences you can compute more efficiently using `conv`!) – Cris Luengo Mar 04 '19 at 18:47
  • Thank you for your comments Cris, really appreciate you point out the mistakes, this is really the place for me to learn and share my experience. I am a scientist and user of Matlab, and the post reflects a "user's view" of Matlab, and when its performance lags behind. We complain about the "slow engine", and it would be great if you can provide the correct reasoning by editing the post. Thank you for helping! The finite difference is just an example of matrix indexing performance issue, in real application, matrix indexing becomes much more complicated for us, and a real performance issue. – PhD AP EcE Mar 04 '19 at 18:53
  • OK, I've removed the two statements that I think are wrong. Feel free to revert my edits or modify them as you think is reasonable. I agree with the indexing, it is quite slow, and often the reason that vectorized code is slower than simpler loopy code under the JIT. -- BTW: ping me by adding `@Cris` to your post. I saw your reply here by chance. – Cris Luengo Mar 04 '19 at 19:30
  • Can you please explain how writing the 'logically simple but numerically heavy' operations in C++ and calling them with MEX helps speed up Matlab. Because this is in direct contradiction to countless questions on stackoverflow where the consensus is that Matlab is almost optimal for matrix multiplication since it uses Intel's MKL library. And thus the best C++ can hope for is to equal Matlab's performance but not surpass it. – sonicboom Jan 04 '21 at 20:55
  • See for example [https://stackoverflow.com/questions/6058139/why-is-matlab-so-fast-in-matrix-multiplication][1] or [https://stackoverflow.com/questions/46714235/performance-matlab-vs-c-matrix-vector-multiplication][2] – sonicboom Jan 04 '21 at 20:56
  • @sonicboom there are many situations where calling MKL won't be a good option, for example when you need to perform some simple calculations for a huge number of objects: you cannot easily vectorize the operations using SIMD or MKL because the data belong to different objects, and you cannot parallelize the computation using Matlab's multi-process model either, because each "task" is too small and most of the time will be spent on dispatching rather than computation. – PhD AP EcE Feb 13 '21 at 01:35
  • The answer from @DCS states "Matlab performance depends strongly (and much more than C++ performance) on your coding style.", whereas the answer here states "Matlab code performance is much less sensitive to the quality of the code comparing with C++ code." - Can someone reconcile these two opposing views? – Georgeos Hardo Dec 26 '21 at 21:03
  • @GeorgeosHardo, Once the Matlab code is vectorized, there is very little a Matlab programmer can do to further improve the performance. On the other hand, one can do crazy things in C++: (1) for single core: cache optimization, branching optimization, dynamic code binding, SIMD......(2) for multi-core: multi-threading + thread management (priority, core binding, thread local memory management, thread pooling) + NUMA + concurrent memory model (lock free for example). I think it is more accurate to say "Matlab performance depends much less on programmer's coding skill than C++". – PhD AP EcE May 19 '22 at 17:13
11

In my experience (several years of Computer Vision and image processing in both languages) there is no simple answer to this question, as Matlab performance depends strongly (and much more than C++ performance) on your coding style.

Generally, Matlab wraps the classic C++ / Fortran based linear algebra libraries. So anything like x = A\b is going to be very fast. Also, Matlab does a good job of choosing the most efficient solver for these types of problems, so for x = A\b Matlab will look at the size of your matrices and choose the appropriate low-level routines.

Matlab also shines in data manipulation of large matrices if you "vectorize" your code, i.e. if you avoid for loops and use index arrays or boolean arrays to access your data. This stuff is highly optimised.

For other routines, some are written in Matlab code, while others point to a C/C++ implementation (e.g. the Delaunay stuff). You can check this yourself by typing edit some_routine.m. This opens the code and you see whether it is all Matlab or just a wrapper for something compiled.

Matlab, I think, is primarily for comfort - but comfort translates to coding time and ultimately money which is why Matlab is used in the industry. Also, it is easy to learn for engineers from other fields than computer science, with little training in programming.

DCS
  • Thanks for the answer. But given all that wisdom of matlab and Fortran level of optimization, why should matlab be slower! – Loves Probability Dec 11 '13 at 08:02
  • I mean, isn't it a wasted effort to optimize at assembly level if you don't see any improvement? Also, though I see your point, "strongly depends on your coding style" is something that needs some elaboration. I would actually ask to assume a reasonable programming efficiency on both sides, which still leaves this 100x range of difference. – Loves Probability Dec 11 '13 at 08:23
  • The point is: Matlab is very fast for some large operations (matrix and lin. alg. stuff, done low level) but fairly slow for everything else, where it is essentially an interpreted scripting language (with much less sophistication than other scripting languages). The problems Matlab is optimised for are those where the workload is in large matrix and linear algebra operations with little (slow) boilerplate code around. If your problem is not of that kind (i.e. if you have to code a lot with for loops and non-matrix operations) Matlab speed deteriorates. – DCS Dec 11 '13 at 08:35
  • 1
    BTW, the name Matlab comes from "Matrix Laboratory" - the product was initially created to provide a comfortable interface to the fast Fortran / C linear algebra routines of LAPACK and other packages. This is where the division between the fast, optimised part and the scripting part of the language comes from. – DCS Dec 11 '13 at 08:43
  • I agree with DCS - it's all about how you code. I see much code written in Matlab that does not make use of built-in functions and that is not vectorized. Do these two things and it runs pretty damn fast. I'd be surprised if the total time saved by running code in C/C++ outweighs the extra time spent writing, testing and verifying that optimized code compared to the time taken to write, test and verify well-written Matlab code! – Wybird666 Aug 05 '18 at 23:26
8

As a PhD student too, and a Matlab user for 10 years, I'm glad to share my POV:

Matlab is a great tool for developing and prototyping algorithms, especially when dealing with GUIs and high-level analysis (frequency domain, LS optimization, etc.): fast coding, powerful syntax (think about [],{},: etc.).

As soon as your processing chain is more stable and well defined and your data dimensions grow, move to C/C++.

Matlab's main limitation comes from its script-like language: as long as you avoid explicit loops (using arrayfun, cellfun or other matrix procedures), performance is high, since the called subroutine is again in C/C++.

Fabio Veronese
5

Your question is difficult to answer. In general C++ is faster, but if you make use of Matlab's well-written algorithms, it can outperform C++. In some cases Matlab can parallelize your code, which in many cases has to be done manually in C++. Matlab can also, to some extent, export C++ code.

So my conclusion is that you have to measure the performance of both programs to get an answer. But then you are comparing your two implementations, not Matlab and C++ in general.

usr1234567
5

Matlab does very well with linear algebra and array/matrix operations, since they seem to have been doing some extra optimizations on the underlying operations - if you want to beat Matlab there, you would need a similarly optimized BLAS/LAPACK library.

As an interpreted language, Matlab loses time whenever a Matlab function is called, due to internal overhead, which traditionally meant that Matlab loops were slow. This has been alleviated somewhat in recent years thanks to significant improvements in the JIT compiler (search for "performance" questions on Matlab on SO for examples). As a consequence of the function call overhead, any Matlab function that has not been implemented in C/C++ behind the scenes (call edit functionName to see whether it's written in Matlab) risks being slower than a C/C++ counterpart.

Finally, Matlab attempts to be user friendly, and may do "unnecessary" input checking that can take time (due to function call overhead). For example, if you know that ismember gets sorted inputs, you can call ismembc directly (the behind-the-scene compiled function), saving quite a bit of time.

Jonas
4

I think you can consider the difference along at least four dimensions.

  1. Compiled vs Interpreted
  2. Strongly-typed vs Dynamically-typed
  3. Performance vs Fast-prototyping
  4. Special strength

Points 1-3 generalize easily to a comparison between two families of programming languages.

For 4, MATLAB is optimized for matrix operations. So if you can vectorize more of your code in MATLAB, the performance can be drastically boosted. Conversely, if many loops are required, never hesitate to use C++ or create a MEX file.

It is a difficult question after all.

Ray
3

I saw a 5.5x speed improvement when switching from MATLAB to C++. This was for a robot controller, with lots of loops and ODE solving. I spent many hours trying to optimize the MATLAB code and hardly any time optimizing the C++ (I'm sure it could have been 10x faster with a little more effort).

However, it was easy to add a GUI for the MATLAB code, so I still use it more often. Like others have said, it was nice to prototype first on MATLAB. That made the implementation on C++ much simpler.

AndyZe
2

Besides the speed of the final program, you should also take into account the total development time of your code, i.e., not only the time to write it, but also to debug it, etc. Matlab (and its open-source counterpart, Octave) can be good for quick prototyping due to its visualisation capabilities.

If you're using straight C++ (i.e., no matrix libraries), it may take you much longer to write C++ code that's equivalent to Matlab code (e.g., there might be no point in spending 10 hours writing C++ code that only runs 10 seconds quicker than a Matlab program that took 5 minutes to write).

However, there are dedicated C++ matrix libraries, such as Armadillo, which provide a Matlab-like API. This can be useful for writing performance critical code that can be called from Matlab, or for converting Matlab code into "real" programs.

mtall
2

Some Matlab code uses standard linear algebra functions with multithreading built in, so it can appear faster than sequential C code.