
I've read this but I still don't understand why vectorized code is faster.

In for loops, I can use parfor for parallel computation. If vectorized code is faster, does it mean that it is automatically parallelized?

elwc
  • The answers given below are correct, but I believe that big matrix operations may in fact be run in parallel automatically. – Dennis Jaheruddin Dec 28 '12 at 10:00

3 Answers


No. You're mixing two important concepts:

  • MATLAB is designed to perform vector operations really quickly. MATLAB is an interpreted language, which is why loops are so slow in it. MATLAB sidesteps this issue by providing extremely fast (usually written in C, and optimized for the specific architecture) and well-tested functions that operate on vectors. There is really no magic here; it is just a lot of hard work and many years of constant small improvements.

Consider for example a trivial case such as the following:

s=0;
for i=1:length(v),
    s = s+v(i);
end

and

sum(v)

You should probably use tic and toc to time these two versions to convince yourself of the difference in runtime. There are about 10 similar commonly used functions that operate on vectors; examples are bsxfun, repmat, length and find. Vectorization is a standard part of using MATLAB effectively. Until you can vectorize code effectively you're just a tourist in the world of MATLAB not a citizen.
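As a rough sketch of the timing suggested above (the vector size and the variable names tLoop, tVec and sVec are my own assumptions, not part of the original example), something like this could be used:

```matlab
% Minimal timing sketch: explicit loop versus the built-in sum.
v = rand(1e6, 1);              % assumed test data: one million random doubles

tic;
s = 0;
for i = 1:length(v)
    s = s + v(i);              % one interpreted iteration per element
end
tLoop = toc;

tic;
sVec = sum(v);                 % a single call into optimized built-in code
tVec = toc;

fprintf('loop: %.4f s, sum: %.4f s\n', tLoop, tVec);
% The two results can differ by rounding, since the summation order may differ.
fprintf('relative difference: %.2e\n', abs(s - sVec) / abs(sVec));
```

A single tic/toc pair is noisy, so it is worth running the script a few times; timeit is an alternative in releases that provide it. The actual ratio depends on the MATLAB version and machine.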

  • Recent versions of MATLAB provide parfor. parfor is not a silver bullet; it is a tool that can be used and misused (try parfor on the sum example above). Not all fors can be parfored. parfor is designed for task-parallel problems where each iteration of the loop is independent of every other iteration. This is a key requirement for using a parfor-loop.

While parfor can help a lot in many cases, the kinds of loops that can be parfored for very large gains are rare.
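To illustrate the independence requirement, here is a minimal sketch. It assumes the Parallel Computing Toolbox is available; the workload and the names n, results and serial are hypothetical and chosen only for illustration.

```matlab
% A loop that suits parfor: no iteration reads another iteration's result.
n = 8;
results = zeros(1, n);
parfor k = 1:n
    x = 1:(k * 1e5);                 % hypothetical, moderately heavy workload
    results(k) = sum(sqrt(x));       % each iteration writes only its own slot
end

% A loop that does not suit parfor: iteration k needs the value from k-1,
% so the iterations cannot run in an arbitrary order.
serial = zeros(1, n);
serial(1) = 1;
for k = 2:n
    serial(k) = 2 * serial(k - 1) + 1;
end
```

Whether the first loop actually runs faster with parfor depends on how much work each iteration does compared with the cost of handing it to a worker.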

Ray
carlosdc
  • You missed one important thing - **matlab is an interpreted language**. It's why loops are so slow in it, especially compared to the highly optimized matrix operations matlab was designed for. – Leonid Beschastny Dec 26 '12 at 08:01
  • Shai: thanks. @Leonid Beschastny: I've edited my answer to reflect why looping in matlab is so slow. Thanks. – carlosdc Dec 26 '12 at 08:51
  • You may also want to add that the reason loops are slow in an interpreted language is that each function call comes with a bit of an overhead. Thus, when you call a function once with an array of a million elements, you're much faster than calling it a million times with a scalar. – Jonas Dec 26 '12 at 14:38
  • Just to note that repmat is only faster when the arrays are large. for loops are still better in simple problems. The same is true of parfor loops, which only work well when the problem requires much computation. Otherwise parfor will spend too much time allocating the jobs to other cores. – elwc Dec 26 '12 at 19:07
  • Also, Matlab has a JIT compiler built in now that closes the gap between for-loop and vectorized code considerably, as my own answer shows. – KlausCPH Jan 03 '13 at 07:36
  • "Until you can vectorize code effectively you're just a tourist in the world of MATLAB not a citizen": very well put :-) – Luis Mendo Jun 25 '15 at 15:08
  • Perhaps it should be added that some _vectorized_ functions are also [_multithreaded_](http://www.mathworks.com/matlabcentral/answers/95958-which-matlab-functions-benefit-from-multithreaded-computation), so they combine both types of gains – Luis Mendo Jun 25 '15 at 15:50

I agree with carlosdc on his answer. However, it is important to remember that Matlab since release 6.5 has included a JIT compiler for speeding up for-loops and the like.

I made a quick test of your sum example with a million elements in v and got the following results:

  • sum(v): 4.3 ms
  • for-loop version: 16 ms
  • for-loop version, no JIT: 966 ms

The JIT can be turned on and off like this:

feature accel off
feature accel on

A factor of 4 improvement from vectorizing code is of course still often worth it, but for-loops shouldn't be feared as they once were for problems where they are otherwise a good solution. Often, though, a piece of well-vectorized code can be simpler, less error-prone and faster at the same time.

KlausCPH

In modern computers, the registers (temporary memory used for math, among other uses) have many bits and can manipulate multiple numbers together. For example, if your data is uint8 (8 bits), you can add a number to one element per CPU clock cycle, or you can put 8 of them together in a register and add a number to all of them in one CPU clock cycle. This way you work 8 times faster than the for-loop.

This is in a sense parallelization, but not like parfor. Parfor uses multiple cores of your CPU, and in the above method one core is used more efficiently. If you use them both, you can achieve even higher speeds.
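The uint8 case can be sketched as follows. The code only shows the two ways of writing the addition and checks that they agree; whether MATLAB really packs several elements into one register for the vectorized form is an assumption that depends on the release and the CPU. The data and the names data, k, outLoop and outVec are made up for the illustration.

```matlab
% Adding a constant to uint8 data: element by element versus in one expression.
data = uint8(randi([0 200], 1, 1e6));   % assumed sample data, kept below 255
k = uint8(5);

% Element-wise loop: one addition per iteration.
outLoop = zeros(size(data), 'uint8');
for i = 1:numel(data)
    outLoop(i) = data(i) + k;
end

% Vectorized form: the whole array is handed to the built-in addition,
% which is then free to process several elements per instruction.
outVec = data + k;

assert(isequal(outLoop, outVec));       % both forms give the same result
```

Timing the two forms with tic and toc gives a feel for the difference on a particular machine.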

Yanai Ankri