
I am very new to Matlab, and I feel completely overwhelmed by the use of arrays. What is the most efficient implementation of the following C++ code in Matlab?

std::vector<double> A;

for (int i = 0; i < 100; i++) {
  if (complicatedBoolFunction(i)) {
    A.push_back(i);
  }
}

Edit: By efficiency I mean using as few resources as possible to grow the array A, that is, avoiding copying it into temporary memory.

Aleksejs Fomins

2 Answers


You can do this in two ways:

  1. Pre-allocating for the maximum size, and removing unused elements. This has the advantage of pre-allocating memory in case the condition is often met...

    A = NaN(100, 1); % pre-allocate for the maximum size
    for ii = 0:99
        if rand > 0.5    % some condition
            A(ii+1) = ii; % some value
        end
    end
    A(isnan(A)) = []; % remove unused elements
    
  2. Appending to the array. This avoids making A way too large if appending is unlikely...

    A = []; % empty array
    for ii = 0:99
        if rand > 0.5 % some condition
            A(end+1, 1) = ii; % some value. Equivalent to 'A = [A; ii];'
        end
    end
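
A variant of option 1, as a minimal sketch: track the number of stored elements with a counter instead of relying on a NaN sentinel, so that it also works when NaN could be a legitimate value. The names `maxCount` and `n` are illustrative, and `mod(ii, 3) == 0` is only a stand-in for the real condition.

```matlab
maxCount = 100;            % upper bound on the number of elements (assumed known)
A = zeros(maxCount, 1);    % pre-allocate once
n = 0;                     % number of elements stored so far
for ii = 0:maxCount-1
    if mod(ii, 3) == 0     % stand-in for the real condition
        n = n + 1;
        A(n) = ii;         % store at the next free position
    end
end
A = A(1:n);                % trim the unused tail
```

The array is allocated once and shrunk once, so no resizing happens inside the loop.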
    

A better, and more Matlab-esque way of doing this would be to vectorise your conditional function. This way you avoid looping and allocation issues...

ii = 0:99;
A = ii(rand(size(ii)) > 0.5);

You can use any Boolean function you like to build the index, as long as it returns either a logical array with the same number of elements as the array you're indexing (ii here) or the positive integer indices of the elements to choose.
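
To illustrate the two kinds of index, here is a small sketch; `isEven` is a hypothetical stand-in for a vectorised condition. Note that the mask must be of class `logical`: a numeric array of 0s and 1s (for example the raw output of `mod`) is treated as integer indices and raises an error, because 0 is not a valid index.

```matlab
ii = 0:99;
isEven = @(x) mod(x, 2) == 0;   % hypothetical vectorised condition; returns a logical array

mask = isEven(ii);              % logical mask, same size as ii
A1 = ii(mask);                  % logical indexing

idx = find(mask);               % positive integer indices of the true elements
A2 = ii(idx);                   % integer indexing, same result as A1
```

Logical indexing is usually the more direct of the two, since it skips the intermediate call to `find`.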

Wolfie
  • I like the third, **Matlab-esque** way, it's not always practical but it seems the most efficient way - no looping – user2305193 Feb 19 '18 at 14:48
  • @user2305193: Loops are no longer slow in MATLAB; I've noticed that code sometimes slows down after vectorizing a loop! See for example here: https://stackoverflow.com/a/48220174/7328782 – Cris Luengo Feb 19 '18 at 17:27
  • Agreed @Cris, not *necessarily* slow, although for this example it is clearly neater if the Boolean function call can be vectorised. Whilst the newer JIT compiler is pretty decent with loops, vectorising code can still yield decent performance improvements for some operations. – Wolfie Feb 19 '18 at 17:40
  • I agree. And oftentimes (as in this case) vectorized code is easier to read too. But some vectorization optimizations are really hard to decipher! :) – Cris Luengo Feb 19 '18 at 17:43
  • @CrisLuengo I was aware loops aren't punitively slow in Matlab, and I couldn't agree more: loops are more practical, especially with iteratively increased variables or when pushing values to vectors, because they are more readable and easier to modify on the fly. But Matlab seems to be laid out for the 'boolean way'; it would be surprising if it didn't handle these faster than (or at least as fast as) 'conventional' loops – user2305193 Feb 20 '18 at 01:38

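If complicatedBoolFunction is vectorised, i.e. it accepts an array and returns a logical array of the same size, the most efficient implementation of this C++ code would be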

i = 0:99;
A = i(complicatedBoolFunction(i));

Otherwise, you can grow an array by concatenation, which is (or was) usually not recommended, like this:

A = [];

for i = 0:99
  if (complicatedBoolFunction(i))
    A = [A i];
  end
end

or much more efficiently like this:

A = [];

for i = 0:99
  if (complicatedBoolFunction(i))
    A(end + 1) = i;
  end
end
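
To compare the two growth patterns on your own machine, a rough timing sketch with `tic`/`toc` might look like the following. The loop length `N` and the condition `mod(k, 2) == 0` are arbitrary stand-ins for illustration, and the timings will depend on the MATLAB version and hardware.

```matlab
N = 1e4;                      % arbitrary loop length for the comparison

tic
A = [];
for k = 0:N-1
    if mod(k, 2) == 0         % stand-in condition
        A = [A k];            % grow A by concatenation
    end
end
tConcat = toc;

tic
B = [];
for k = 0:N-1
    if mod(k, 2) == 0         % same stand-in condition
        B(end + 1) = k;       % grow B via end+1 indexing
    end
end
tAppend = toc;

fprintf('concatenation: %.4f s, end+1: %.4f s\n', tConcat, tAppend);
```

Both loops build the same row vector, so any difference in the printed times comes from how the array is grown.
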
Marco
  • Thanks for the reply, Marco. The last two suggestions work well. The first suggestion, with a simple example of `i = 0:99; A = i(mod(i, 2))`, fails with **Subscript indices must either be real positive integers or logicals** – Aleksejs Fomins Feb 19 '18 at 14:17
  • Because the return of `mod(i, 2)` is an array of `double` values. Try with `mod(i, 2) == 0` – Marco Feb 19 '18 at 14:19
  • The second form to grow the array is **much** more efficient, as the first form copies the array at every iteration. See here: https://stackoverflow.com/q/48351041/7328782 (compare graphs in the question and the answer). – Cris Luengo Feb 19 '18 at 17:36