1

I have a matrix in MATLAB, and I want to determine which row contains the most zeros. That is, I want to find the sparsest row (and column) in the matrix. Any way to do this more efficiently than looping with mat(sum(mat==0,i)==i,:)? Or is this the preferred method?

I'm using this for an implementation of an LU-factorization by using "minimum-degree ordering heuristic" to solve a linear system.

Adriaan
  • 17,741
  • 7
  • 42
  • 75

2 Answers2

7

To find the "sparsest" row if I'm interpreting this correctly, that means you want to find the row with the most number of zeroes. You can use sum in a vectorized fashion combined with max to figure that out:

[~, row] = max(sum(mat == 0, 2));

mat == 0 will convert your entire matrix into true/false such that true is a zero value and false otherwise, then we use sum and sum over each row individually. This will reduce the problem such that each element counts up how many zero elements we encountered per row. Using the second output of max on this result, row will thus contain whichever row had the largest amount of zeroes. We ignore the first output as this will output the actual largest value which we don't care about given your problem.

If you're concerned about speed, you can perform a matrix-vector multiplication with the converted true/false matrix with a vector of ones. This way it will perform the sum for you and thus the matrix gets promoted to double precision:

[~, row] = max((mat == 0)*ones(size(mat, 2), 1)));

Minor Note

Take note that if there are multiple rows that share the same number of the largest number of zeroes found, the limitation with this approach is that it will return the first row that satisfies this condition. In terms of your objective, I believe this should be sufficient. Also be aware that if your matrix contains no zeroes, the output of this approach will default to the first row. There are some corner cases that I didn't take into account, but I'm not sure to what extent you would need to check for these so I'll leave those out for the sake of simplicity.

rayryeng
  • 102,964
  • 22
  • 184
  • 193
2

I do not know of a function that can do this row-wise in a vectorised manner (that is, until I saw rayryeng's ingenious solution), but looping over nnz should be rather fast. Loops are no longer very slow, since the new engine came out with R2016a.

tmp = zeros(size(mat,1),1);
for ii = 1:size(mat,1)
    tmp(ii) = nnz(mat(ii,:));
end

As a side note: I'd argue against using i as a variable, as it denotes the imaginary unit.

Adriaan
  • 17,741
  • 7
  • 42
  • 75
  • 1
    This approach does have merit, especially with the new JIT as of 2016a. You have my vote. – rayryeng Nov 06 '18 at 19:48
  • I hadn't considered `nnz`, nice. Though I'd argue that using `i` and `j` aren't a big deal. I acknowledge this is my opinion. See https://stackoverflow.com/a/14861015/8239061 – SecretAgentMan Nov 07 '18 at 20:02
  • @SecretAgentMan I have noticed that with recent versions of MATLAB, the supposed speed slowdown using i and j as variables is practically nonexistent. I have been using these as variables as of late. – rayryeng Nov 07 '18 at 23:20
  • 1
    @rayryeng, I haven't tested it but I use `i` and `j` as variables (indexes) myself. I haven't found the need move away from that in my applications. There's at least two of us then. – SecretAgentMan Nov 08 '18 at 02:13