Four nested for loops optimization - I promise I searched

Question

I've tried to find a good way to speed up the code for a problem I've been working on. The basic idea of the code is very simple. There are five inputs:

Four 1xm vectors (A, B, C, D; each with some m < n, and the sizes can differ) that are pairwise-disjoint subsets of {1,2,...,n}, and one nxn symmetric binary matrix (M). The code checks an inequality for every combination of elements and, if the inequality holds, returns the values that cause it to hold, i.e.:

    for a = A
      for b = B
        for c = C
          for d = D
            if M(a,c) + M(b,d) < M(a,d) + M(b,c)
              result = [a b c d];
              return
            end
          end
        end
      end
    end

I know there has to be a better way to do this. First, since it's symmetric, I can cut down half of the items checked since M(a,b) = M(b,a). I've been researching vectorization, found several functions I'd never heard of with MATLAB (since I'm relatively new), but I can't find anything that will particularly help me with this specific problem. I've thought of other ways to approach the problem, but nothing has been perfected, and I just don't know what to do at this point.

For example, I could possibly split this into two cases: 1) The right hand side is 1: then I have to check that both terms on the left side are 0. 2) The right hand side is 2: then I have to check that at least one term on the left hand side is 0.

But, again, I won't be able to avoid nesting.

I appreciate all the help you can offer. Thank you!

You can use the answers from this question to get your combination matrix http://stackoverflow.com/questions/4165859/generate-all-possible-combinations-of-the-elements-of-some-vectors-cartesian-pr — Some Guy, Sep 08 '16 at 20:18

score 1 · Answer 1 · edited May 23 '17 at 12:19

You're asking two questions here: (1) is there a more efficient algorithm to perform this search, and (2) how can I vectorize this in MATLAB. The first one is very interesting to think about, but may be a little beyond the scope of this forum. The second one is easier to answer.

As pointed out in the comments below your question, you can vectorize the for loops by enumerating all of the possibilities and checking them all together, and the answers to the question linked there can help:

    [a,b,c,d] = ndgrid(A,B,C,D);     % Enumerate all combos
    a=a(:); b=b(:); c=c(:); d=d(:);  % Reshape from 4-D matrices to vectors
    ac = sub2ind(size(M),a,c);       % Convert subscript pairs to linear indices
    bd = sub2ind(size(M),b,d);
    ad = sub2ind(size(M),a,d);
    bc = sub2ind(size(M),b,c);
    mask = (M(ac) + M(bd) < M(ad) + M(bc));         % Test the inequality
    results = [a(mask), b(mask), c(mask), d(mask)]; % Select the ones that pass

Again, this isn't an algorithmic change: it still has the same complexity as your nested for loop. The vectorization may cause it to run faster, but it also lacks early termination, so in certain cases it may be slower.
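
One possible middle ground, sketched here as an assumption rather than something taken from the linked answers, is to keep the loops over a and b and vectorize only the (c,d) check, so the search can still stop at the first hit. The tiny M, A, B, C, D below are made-up test inputs, and bsxfun is used so the sketch also runs on releases without implicit expansion:

```matlab
% Made-up inputs for illustration (not from the question)
M = false(8);
M(1,7) = true; M(3,5) = true;   % edges a=1 -- d=7 and b=3 -- c=5
M = M | M.';                    % keep M symmetric
A = [1 2]; B = [3 4]; C = [5 6]; D = [7 8];

result = zeros(0,4);            % stays empty if no combination passes
for a = A
    for b = B
        % lhs(i,j) = M(a,C(i)) + M(b,D(j)),  rhs(i,j) = M(a,D(j)) + M(b,C(i))
        lhs = bsxfun(@plus, double(M(a,C)).', double(M(b,D)));
        rhs = bsxfun(@plus, double(M(a,D)),   double(M(b,C)).');
        [ic, id] = find(lhs < rhs, 1);   % first passing (c,d) pair, if any
        if ~isempty(ic)
            result = [a, b, C(ic), D(id)];
            break
        end
    end
    if ~isempty(result), break, end      % early termination of the outer loop
end
```

Each iteration only builds numel(C)-by-numel(D) temporaries, so memory stays modest compared with enumerating all four dimensions at once.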

score 0 · Answer 2 · edited Sep 26 '16 at 12:59

If you have access to the Neural Network Toolbox, combvec could be helpful here.

Running allCombs = combvec(A,B,C,D) will give you a 4-by-(m1*m2*m3*m4) matrix in which the first input varies fastest, so it looks like:

    [a1, a2, ... am1, a1, a2, ... am1, ... am1;
     b1, b1, ... b1,  b2, b2, ... b2,  ... bm2;
     c1, c1, ... c1,  c1, c1, ... c1,  ... cm3;
     d1, d1, ... d1,  d1, d1, ... d1,  ... dm4]

You can then use sub2ind and matrix indexing to set up the four values you need for your inequality:

    indices = [sub2ind(size(M),allCombs(1,:),allCombs(3,:));
               sub2ind(size(M),allCombs(2,:),allCombs(4,:));
               sub2ind(size(M),allCombs(1,:),allCombs(4,:));
               sub2ind(size(M),allCombs(2,:),allCombs(3,:))];

    testValues = M(indices);
    testValues(5,:) = (testValues(1,:) + testValues(2,:) < testValues(3,:) + testValues(4,:));

Your final a,b,c,d values can then be retrieved with

    allCombs(:,find(testValues(5,:)))

which prints a matrix containing every column for which the inequality holds.

This article might be of some use.
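
To illustrate why sub2ind is needed here instead of plain M(rows,cols) indexing, here is a small sketch with a made-up 3-by-3 matrix whose entries encode their own position (M(i,j) = 10*i + j). The names rows, cols, pairwise and block are just for illustration:

```matlab
% Made-up matrix: the entry at (i,j) is 10*i + j, so values are easy to read
M = [11 12 13;
     21 22 23;
     31 32 33];
rows = [1 2 3];
cols = [3 3 1];

% sub2ind pairs the subscripts element by element: (1,3), (2,3), (3,1)
idx      = sub2ind(size(M), rows, cols);   % column-major linear indices
pairwise = M(idx);                         % one value per (row,col) pair

% Plain subscript indexing returns the whole 3-by-3 block of rows x cols
block = M(rows, cols);
```

Here pairwise comes out as [13 23 31], which is the diagonal of block; the other entries of block are combinations you did not ask for.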

KQS · Answer 3 · 2016-09-09T17:04:52.267

Since M is binary, we can think about this as a graph problem. i,j in {1..n} correspond to nodes, and M(i,j) indicates whether there is an undirected edge connecting them.

Since A,B,C,D are disjoint, that simplifies the problem a bit. We can approach the problem in stages:

1. Find all (c,d) for which there exists a such that M(a,c) < M(a,d). Let's call this set CD_lt_a (the subset of C*D such that the "less than" inequality holds for some a).
2. Find all (c,d) for which there exists a such that M(a,c) <= M(a,d), and call this set CD_le_a.
3. Repeat for b, forming CD_lt_b for M(b,d) < M(b,c) and CD_le_b for M(b,d) <= M(b,c).
4. One way to satisfy the overall inequality is for M(a,c) < M(a,d) and M(b,d) <= M(b,c), so we can look at the intersection of CD_lt_a and CD_le_b.
5. The other way is if M(a,c) <= M(a,d) and M(b,d) < M(b,c), so look at the intersection of CD_le_a and CD_lt_b.
6. With (c,d) known, we can go back and find the (a,b).
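
A quick way to convince yourself that the two ways listed above cover the inequality exactly (for binary values) is to brute-force all 16 assignments. This is only a sanity-check sketch; the names ac, bd, ad, bc are placeholders standing for M(a,c), M(b,d), M(a,d), M(b,c):

```matlab
% Enumerate every binary assignment of the four matrix entries
[ac, bd, ad, bc] = ndgrid([0 1], [0 1], [0 1], [0 1]);

direct = (ac + bd) < (ad + bc);          % the original inequality
way1   = (ac <  ad) & (bd <= bc);        % strict in the a-part
way2   = (ac <= ad) & (bd <  bc);        % strict in the b-part
staged = way1 | way2;

agree   = isequal(direct, staged);       % do the two formulations match?
numPass = nnz(direct);                   % how many of the 16 cases pass
```

If agree is true, splitting the test into an a-part and a b-part loses nothing, which is what lets the a and b searches be done independently.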

And so my implementation is:

    % 0. Some preliminaries
    % Get the size of each set
    mA = numel(A); mB = numel(B); mC = numel(C); mD = numel(D);

    % 1. Find all (c,d) for which there exists a such that M(a,c) < M(a,d)
    CA_linked = M(C,A);
    AD_linked = M(A,D);
    CA_not_linked = ~CA_linked;
    % Multiplying these matrices tells us, for each (c,d), how many nodes
    % in A satisfy this M(a,c)<M(a,d) inequality
    % Ugh, we need to cast to double to use the matrix multiplication
    CD_lt_a = (CA_not_linked * double(AD_linked)) > 0;

    % 2. For M(a,c) <= M(a,d), check that the negation, M(a,c) > M(a,d),
    %    is false for at least one a (i.e. it does not hold for all mA nodes)
    AD_not_linked = ~AD_linked;
    CD_le_a = (CA_linked * double(AD_not_linked)) < mA;

    % 3. Repeat for b
    CB_linked = M(C,B);
    BD_linked = M(B,D);
    CD_lt_b = (CB_linked * double(~BD_linked)) > 0;
    CD_le_b = (~CB_linked * double(BD_linked)) < mB;

    % 4. Find the intersection of CD_lt_a and CD_le_b - this is one way
    %    to satisfy the inequality M(a,c)+M(b,d) < M(a,d)+M(b,c)
    CD_satisfy_ineq_1 = CD_lt_a & CD_le_b;

    % 5. The other way to satisfy the inequality is CD_le_a & CD_lt_b
    CD_satisfy_ineq_2 = CD_le_a & CD_lt_b;
    inequality_feasible = any(CD_satisfy_ineq_1(:) | CD_satisfy_ineq_2(:));

Note that you can stop here if feasibility is your only concern. The complexity is A*C*D + B*C*D, which is better than the worst-case A*B*C*D complexity of the for loop. However, early termination means your nested for loops may still be faster in certain cases.
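
To make the counting trick concrete, here is a small sketch on made-up inputs (the 8-node M and the two-element sets are assumptions for illustration only). It compares the matrix product against a direct count of the a that satisfy M(a,c) < M(a,d):

```matlab
% Made-up inputs for illustration
M = false(8);
M(1,7) = true; M(3,5) = true;
M = M | M.';                     % keep M symmetric
A = [1 2]; C = [5 6]; D = [7 8];

% counts(i,j) = number of a in A with M(a,C(i)) == 0 and M(a,D(j)) == 1
counts = double(~M(C,A)) * double(M(A,D));

% The same numbers obtained by looping over every (c,d) pair
brute = zeros(numel(C), numel(D));
for i = 1:numel(C)
    for j = 1:numel(D)
        brute(i,j) = sum(M(A,C(i)) < M(A,D(j)));
    end
end

CD_lt_a_demo = counts > 0;       % (c,d) pairs that have at least one such a
```

The product does the inner sum over a for all (c,d) at once, which is where the A*C*D term in the complexity comes from.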

The next block of code enumerates all the a,b,c,d that satisfy the inequality. It's not very well optimized (it appends to a matrix from within a loop), so it can be pretty slow if there are many results.

    % 6. With (c,d) known, find a and b
    % We can define these functions to help us search
    find_a_lt = @(c,d) find(CA_not_linked(c,:)' & AD_linked(:,d));
    find_a_le = @(c,d) find(CA_not_linked(c,:)' | AD_linked(:,d));
    find_b_lt = @(c,d) find(CB_linked(c,:)' & ~BD_linked(:,d));
    find_b_le = @(c,d) find(CB_linked(c,:)' | ~BD_linked(:,d));
    % I'm gonna assume there aren't too many results, so I will be appending
    % to an array inside of a for loop. Bad for performance, but maybe a bit
    % more readable for a StackOverflow answer.
    results = zeros(0,4);
    % Find those that satisfy it the first way
    [c_list,d_list] = find(CD_satisfy_ineq_1);
    for ii = 1:numel(c_list)
        c = c_list(ii); d = d_list(ii);
        a = find_a_lt(c,d);
        b = find_b_le(c,d);
        % a,b might be vectors, in which case all combos are valid
        % Many ways to find all combos, gonna use ndgrid()
        [a,b] = ndgrid(a,b);
        % Append these to the growing list of results
        abcd = [a(:), b(:), repmat([c d],[numel(a),1])];
        results = [results; abcd];
    end
    % Repeat for the second way
    [c_list,d_list] = find(CD_satisfy_ineq_2);
    for ii = 1:numel(c_list)
        c = c_list(ii); d = d_list(ii);
        a = find_a_le(c,d);
        b = find_b_lt(c,d);
        % a,b might be vectors, in which case all combos are valid
        % Many ways to find all combos, gonna use ndgrid()
        [a,b] = ndgrid(a,b);
        % Append these to the growing list of results
        abcd = [a(:), b(:), repmat([c d],[numel(a),1])];
        results = [results; abcd];
    end
    % Remove duplicates
    results = unique(results, 'rows');
    % And actually these a,b,c,d will be indices into A,B,C,D because they
    % were obtained from calling find() on submatrices of M.
    if ~isempty(results)
        results(:,1) = A(results(:,1));
        results(:,2) = B(results(:,2));
        results(:,3) = C(results(:,3));
        results(:,4) = D(results(:,4));
    end

I tested this on the following test case:

    m = 1000;
    A = (1:m); B = A(end)+(1:m); C = B(end)+(1:m); D = C(end)+(1:m);
    M = rand(D(end),D(end)) < 1e-6; M = M | M';

I like to think that the first part (seeing whether the inequality is feasible for any a,b,c,d) worked pretty well. The other vectorized answers (that use ndgrid or combvec to enumerate all combinations of a,b,c,d) would require about 8 terabytes of memory for each 1000^4-element array of doubles in a problem of this size!

But I would not recommend running the second part (enumerating all of the results) when there are more than a few hundred c,d that satisfy the inequality, because it will be pretty damn slow.
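
For reference, the memory figure quoted above can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch that assumes the index arrays are stored as doubles (8 bytes per element) and counts a terabyte as 1e12 bytes:

```matlab
m = 1000;                          % size of each of A, B, C, D in the test case
numCombos     = m^4;               % every (a,b,c,d) combination
bytesPerArray = numCombos * 8;     % one double per combination
tbPerArray    = bytesPerArray / 1e12;   % terabytes for a single 4-D array
tbAllFour     = 4 * tbPerArray;         % a, b, c and d arrays from ndgrid
```

By comparison, the matrix-product approach only needs a handful of m-by-m arrays.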

P.S. I know I answered already, but that answer was about vectorizing such loops in general, and is less specific to your particular problem.

P.P.S. This kinda reminds me of the stable marriage problem. Perhaps some of those references would contain algorithms relevant to your problem as well. I suspect that a true graph-based algorithm could probably achieve the same worst-case complexity as this while additionally offering early termination. But I think it would be difficult to implement a graph-based algorithm efficiently in MATLAB.

P.P.P.S. If you only want one of the feasible solutions, you can simplify step 6 to only return a single value, e.g.

    find_a_lt = @(c,d) find(CA_not_linked(c,:)' & AD_linked(:,d), 1, 'first');
    find_a_le = @(c,d) find(CA_not_linked(c,:)' | AD_linked(:,d), 1, 'first');
    find_b_lt = @(c,d) find(CB_linked(c,:)' & ~BD_linked(:,d), 1, 'first');
    find_b_le = @(c,d) find(CB_linked(c,:)' | ~BD_linked(:,d), 1, 'first');
    % Use (:) so that any() reduces the whole matrix to a single logical
    if any(CD_satisfy_ineq_1(:))
        [c,d] = find(CD_satisfy_ineq_1, 1, 'first');
        a = find_a_lt(c,d);
        b = find_b_le(c,d);
        result = [A(a), B(b), C(c), D(d)];
    elseif any(CD_satisfy_ineq_2(:))
        [c,d] = find(CD_satisfy_ineq_2, 1, 'first');
        a = find_a_le(c,d);
        b = find_b_lt(c,d);
        result = [A(a), B(b), C(c), D(d)];
    else
        result = zeros(0,4);
    end

Thank you for this very in-depth response! Early termination is what I'm aiming for here, so I don't need all possible sets that satisfy the inequality. What language would you recommend trying to program this in instead of MATLAB? — spane, Sep 09 '16 at 16:53
I added an example of how to return only a single solution. If the problem is small enough that you can represent `M` as a full matrix, I think this method (based on matrix multiplication of the adjacency matrix) really isn't so bad. If you really want to try another language, I'd recommend some kind of object-oriented compiled language like C++. But it also depends on what you're already familiar with and/or are interested in learning. I've also heard good things about Julia, which is more like MATLAB, but is compiled. — KQS, Sep 09 '16 at 17:20
Thank you very much! Would you mind explaining why double had to be used with the matrix multiplication? I've looked up the literature on it and don't quite understand why it's necessary in this case. — spane, Sep 11 '16 at 19:00
For things like matrix multiplication, MATLAB relies on numerical linear algebra libraries that are written to operate on floating point numbers. While it's perfectly well-defined to multiply matrices of integers or booleans, it just so happens that nobody has written the code to do so. So if you ask MATLAB to multiply two matrices of logicals, it says that it can't do it. — KQS, Sep 12 '16 at 02:16
