I would like to construct a function
[B, ind] = extract_ones(A)
which removes some sub-arrays from a binary array A in arbitrary dimensions, such that the remaining array B is the largest possible array with only 1's, and I also would like to record in ind that where each of the 1's in B comes from.
Example 1
Assume A is a 2-D array as shown
A =
1 1 0 0 0 1
1 1 1 0 1 1
0 0 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 1 1 1 1
After removing A(3,:) and A(:,3:5), we have the output B
B =
1 1 1
1 1 1
1 1 1
1 1 1
1 1 1
which is the largest array with only ones by removing rows and columns of A. As the fifteen 1's of B corresponds to
A(1,1) A(1,2) A(1,6)
A(2,1) A(2,2) A(2,6)
A(4,1) A(4,2) A(4,6)
A(5,1) A(5,2) A(5,6)
A(6,1) A(6,2) A(6,6)
respectively, or equivalently
A(1) A(7) A(31)
A(2) A(8) A(32)
A(4) A(10) A(34)
A(5) A(11) A(35)
A(6) A(12) A(36)
so, the output ind looks like (of course ind's shape does not matter):
ind = [1 2 4 5 6 7 8 10 11 12 31 32 34 35 36]
Example 2
If the input A is constructed by
A = ones(6,3,4,3);
A(2,2,2,2) = 0;
A(4,1,3,3) = 0;
A(1,1,4,2) = 0;
A(1,1,4,1) = 0;
Then, by deleting the minimum cuboids containing A(2,2,2,2), A(4,1,3,3), A(1,1,4,3) and A(1,1,4,1), i.e. after deleting these entries
A(2,:,:,:)
A(:,1,:,:)
Then the remaining array B will be composed by 1's only. And the ones in B corresponds to
A([1,3:6],2:3,1:4,1:3)
So, the output ind lists the subscripts transformed into indices, i.e.
ind = [7 9 10 11 12 13 15 16 17 18 25 27 28 29 30 31 33 34 35 36 43 45 46 47 48 49 51 52 53 54 61 63 64 65 66 67 69 70 71 72 79 81 82 83 84 85 87 88 89 90 97 99 100 101 102 103 105 106 107 108 115 117 118 119 120 121 123 124 125 126 133 135 136 137 138 139 141 142 143 144 151 153 154 155 156 157 159 160 161 162 169 171 172 173 174 175 177 178 179 180 187 189 190 191 192 193 195 196 197 198 205 207 208 209 210 211 213 214 215 216]
As the array needed to be processed as above is in 8-D, and it should be processed more than once, so can anyone give me opinions on how to composing the program doing this task fast?
My work so far [Added at 2 am (GMT-4), 2nd Aug 2017]
My idea was that I delete the sub-arrays with the largest proportion of zero one by one. And here is my work so far:
Inds = reshape(1:numel(A),size(A)); % Keep track on which 1's survive.
cont = true;
while cont
sz = size(A);
zero_percentage = 0;
Test_location = [];
% This nested for loops are for determining which sub-array of A has the
% maximum proportion of zeros.
for J = 1 : ndims(A)
for K = 1 : sz(J)
% Location is in the form of (_,_,_,...,_)
% where the J-th blank is K, the other blanks are colons.
Location = strcat('(',repmat(':,',1,(J-1)),int2str(K),repmat(',:',1,(ndims(A)-J)),')');
Test_array = eval(strcat('A',Location,';'));
N = numel(Test_array);
while numel(Test_array) ~= 1
Test_array = sum(Test_array);
end
test_zero_percentage = 1 - (Test_array/N);
if test_zero_percentage > zero_percentage
zero_percentage = test_zero_percentage;
Test_location = Location;
end
end
end
% Delete the array with maximum proportion of zeros
eval(strcat('A',Test_location,'= [];'))
eval(strcat('Inds',Test_location,'= [];'))
% Determine if there are still zeros in A. If there are, continue the while loop.
cont = A;
while numel(cont) ~= 1
cont = prod(cont);
end
cont = ~logical(cont);
end
But I encountered two problems:
1) It may be not efficient to check all arrays in all sub-dimensions one-by-one.
2) The result does not contain the most number of rectangular ones. for example, I tested my work using a 2-dimensional binary array A
A =
0 0 0 1 1 0
0 1 1 0 1 1
1 0 1 1 1 1
1 0 0 1 1 1
0 1 1 0 1 1
0 1 0 0 1 1
1 0 0 0 1 1
1 0 0 0 0 0
It should return me the result as
B = 1 1 1 1 1 1 1 1 1 1 1 1 Inds = 34 42 35 43 36 44 37 45 38 46 39 47
But, instead, the code returned me this:
B = 1 1 1 1 1 1 1 1 1 Inds = 10 34 42 13 37 45 14 38 46
*My work so far 2 [Added at 12noon (GMT-4), 2nd Aug 2017]
Here is my current amendment. This may not provide the best result. This may give a fairly OK approximation to the problem, and this does not give empty Inds. But I am still hoping that there is a better solution.
function [B, Inds] = Finding_ones(A)
Inds = reshape(1:numel(A),size(A)); % Keep track on which 1's survive.
sz0 = size(A);
cont = true;
while cont
sz = size(A);
zero_percentage = 0;
Test_location = [];
% This nested for loops are for determining which sub-array of A has the
% maximum proportion of zeros.
for J = 1 : ndims(A)
for K = 1 : sz(J)
% Location is in the form of (_,_,_,...,_)
% where the J-th blank is K, the other blanks are colons.
Location = strcat('(',repmat(':,',1,(J-1)),int2str(K),repmat(',:',1,(ndims(A)-J)),')');
Test_array = eval(strcat('A',Location,';'));
N = numel(Test_array);
Test_array = sum(Test_array(:));
test_zero_percentage = 1 - (Test_array/N);
if test_zero_percentage > zero_percentage
eval(strcat('Testfornumel = numel(A',Location,');'))
if Testfornumel < numel(A) % Preventing the A from being empty
zero_percentage = test_zero_percentage;
Test_location = Location;
end
end
end
end
% Delete the array with maximum proportion of zeros
eval(strcat('A',Test_location,'= [];'))
eval(strcat('Inds',Test_location,'= [];'))
% Determine if there are still zeros in A. If there are, continue the while loop.
cont = A;
while numel(cont) ~= 1
cont = prod(cont);
end
cont = ~logical(cont);
end
B = A;
% command = 'i1, i2, ... ,in'
% here, n is the number of dimansion of A.
command = 'i1';
for J = 2 : length(sz0)
command = strcat(command,',i',int2str(J));
end
Inds = reshape(Inds,numel(Inds),1); %#ok<NASGU>
eval(strcat('[',command,'] = ind2sub(sz0,Inds);'))
% Reform Inds into a 2-D matrix, which each column indicate the location of
% the 1 originated from A.
Inds = squeeze(eval(strcat('[',command,']')));
Inds = reshape(Inds',length(sz0),numel(Inds)/length(sz0));
end