How to Vectorize Dependent For-Loops

Question

I'm working on a function that takes a 1xn vector x as input and returns a nxn matrix L.
I'd like to speed things up by vectorizing the loops, but there's a catch that puzzles me: loop index b depends on loop index a. Any help would be appreciated.

x = x(:); 
n = length(x);
L = zeros(n, n);
for a = 1 : n,
    for b = 1 : a-1,
        c = b+1 : a-1;
        if all(x(c)' < x(b) + (x(a) - x(b)) * ((b - c)/(b-a))),
            L(a,b) = 1;
        end
    end
end

A side note for variables in Matlab: [http://stackoverflow.com/questions/14790740/using-i-and-j-as-variables-in-matlab](http://stackoverflow.com/questions/14790740/using-i-and-j-as-variables-in-matlab) — David K, Aug 27 '13 at 18:26
Thanks for the hint. I've renamed ``i``,``j``,``k`` to ``a``,``b``,``c``. — bluebox, Aug 27 '13 at 18:36
I'm not sure if you realize this, but all([]) evaluates to true. So, for example, when a == 2, b will be 1, and c will be []. This will end up making L(a,b) 1. You may need to think through your b and c ranges (I may be wrong; maybe this is how you intended it to work). — ioums, Aug 27 '13 at 19:28
@BasSwinckels Imagine ``x`` as a bar graph. If the top of bar ``a`` can be connected with the top of bar ``b`` using a straight line (without intersecting any of the bars in between them), the code puts a '1' in ``L(a,b)``. — bluebox, Aug 27 '13 at 21:33

Bas Swinckels · Accepted Answer · 2013-08-28T10:17:04.120

1

From a quick test, it looks like you are doing something with the lower triangle only. You might be able to vectorize using ugly tricks like ind2sub and arrayfun similar to this

tril_lin_idx = find(tril(ones(n), -1));
[A, B] = ind2sub([n,n], tril_lin_idx);
C = arrayfun(@(a,b) b+1 : a-1, A, B, 'uniformoutput', false); %cell array
f = @(a,b,c) all(x(c{:})' < x(b) + (x(a) - x(b)) * ((b - c{:})/(b-a)));
L = zeros(n, n);
L(tril_lin_idx) = arrayfun(f, A, B, C);

I cannot test it, since I do not have x and I don't know the expected result. I normally like vectorized solutions, but this is maybe pushing it a bit too much :). I would stick to your explicit for-loop, which might be much clearer and which Matlab's JIT should be able to speed up easily. You could replace the if with L(a,b) = all(...).

Edit1

Updated version, to prevent wasting ~ n^3 space on C:

tril_lin_idx = find(tril(ones(n), -1));
[A, B] = ind2sub([n,n], tril_lin_idx);
c = @(a,b) b+1 : a-1;
f = @(a,b) all(x(c(a, b))' < x(b) + (x(a) - x(b)) * ((b - c(a, b))/(b-a)));
L = zeros(n, n);
L(tril_lin_idx) = arrayfun(f, A, B);

Edit2

Slight variant, which does not use ind2sub and which should be more easy to modify in case b would depend in a more complex way on a. I inlined c for speed, it seems that especially calling the function handles is expensive.

[A,B] = ndgrid(1:n);
v = B<A; % which elements to evaluate
f = @(a,b) all(x(b+1:a-1)' < x(b) + (x(a) - x(b)) * ((b - (b+1:a-1))/(b-a)));
L = false(n);
L(v) = arrayfun(f, A(v), B(v));

edited Aug 28 '13 at 10:17

answered Aug 27 '13 at 21:45

Bas Swinckels

18,095
3
45
62

Thanks, your vectorized code returns the expected results - however, (to my surprise) it's not nearly as fast as the for loop version. Btw: You're right, I only loop over one matrix triangle - the other triangle will be created like this: ``L = tril(L,-1)+tril(L,-1)'`` – bluebox Aug 27 '13 at 22:05
Matlab's JIT is very good at optimizing 'fortran-style' for-loops, that is hard to beat. A common complaint is also that arrayfun is a bit slow ... – Bas Swinckels Aug 27 '13 at 22:12
1

Another problem with the vectorized version is that you have to pre-compute `A`, `B` and `C`. Especially the latter might take up a lot of useless memory if `x` is big. Its size probably scales with `n^3`, ouch. Your for loop does not suffer from this problem. To solve this problem, you might try to compute `c` on each iteration, I will edit the answer ... – Bas Swinckels Aug 27 '13 at 22:22
I just tried replacing the ``if`` with ``L(a,b) = all(...)``. Funnily enough this also slows down the runtime by approx. 50 %. – bluebox Aug 28 '13 at 08:30
That is weird, no idea why the `if` is faster. Try without the `if` and define `L = false(n)` to prevent the bool to double conversion. Please accept the answer when you are happy, I am done playing with this problem. – Bas Swinckels Aug 28 '13 at 10:25

dlimpid · Answer 2 · 2013-08-28T10:56:16.900

1

If I understand your problem correctly, L(a, b) == 1 if for any c with a < c < b, (c, x(c)) is “below” the line connecting (a, x(a)) and (b, x(b)), right?

It is not a vectorization, but I found the other approach. Rather than comparing all c with a < c < b for each new b, I saved the maximum slope from a to c in (a, b), and used it for (a, b + 1). (I tried with only one direction, but I think that using both directions is also possible.)

x = x(:);
n = length(x);
L = zeros(n);

for a = 1:(n - 1)
  L(a, a + 1) = 1;
  maxSlope = x(a + 1) - x(a);
  for b = (a + 2):n
    currSlope = (x(b) - x(a)) / (b - a);
    if currSlope > maxSlope
      maxSlope = currSlope;
      L(a, b) = 1;
    end
  end
end

I don't know your data, but with some random data, the result is the same with original code (with transpose).

edited Aug 28 '13 at 10:56

answered Aug 28 '13 at 10:23

dlimpid

189
3
5

Nice work, on my laptop (with Matlab R2012b) it's about twice as fast as the original code. – bluebox Aug 28 '13 at 10:45
Really? Maybe I misplaced `tic;toc` when I checked runtime. – dlimpid Aug 28 '13 at 10:54
For ``n=100`` my code usually takes about 0.04 seconds, while yours is done in about 0.02 seconds. – bluebox Aug 28 '13 at 11:14

RyanD · Answer 3 · 2018-02-10T16:44:14.007

1

An esoteric answer: You could do the calculations for every a,b,c from 1:n, exclude the don't cares, and then do the all along the c dimension.

[a, b, c] = ndgrid(1:n, 1:n, 1:n);

La = x(c)' < x(b) + (x(a) - x(b)) .* ((b - c)./(b-a));
La(b >= a | c <= b | c >= a) = true;

L = all(La, 3);

Though the jit would probably do just fine with the for loops since they do very little.

Edit: still uses all of the memory, but with less maths

[A, B, C] = ndgrid(1:n, 1:n, 1:n);

valid = B < A & C > B & C < A;
a = A(valid); b = B(valid); c = C(valid);

La = true(size(A));
La(valid) = x(c)' < x(b) + (x(a) - x(b)) .* ((b - c)./(b-a));
L = all(La, 3);

Edit2: alternate last line to add the clause that c of no elements is true

L = all(La,3) | ~any(valid,3);

edited Feb 10 '18 at 16:44

answered Aug 29 '13 at 03:42

RyanD

116
1
4

Good job! It takes about 3 times as long as the loop-version on my machine...I guess the guys at Mathworks really have done a lot to improve the speed of loops in Matlab. Otherwise, I can't explain to myself why all the great vectorized versions seem to be slower... – bluebox Aug 29 '13 at 08:08
One reason this version might be slow is that `a`, `b`, `c` and `La` all take `O(n^3)` space, while the for-loop only uses `O(n^2)` space. – Bas Swinckels Sep 02 '13 at 16:39

How to Vectorize Dependent For-Loops

3 Answers3