I have a 3D matrix called mat. Each column may start with a variable number of leading zeros, which I need to replace with NaNs. Note that more zeros may follow in any column after the first non-zero element, so simply indexing ALL zeros in the matrix and replacing them with NaN won't give the correct result.

I do have a working solution, but it contains two for-loops. I am wondering whether it's possible to vectorize it and get rid of the loops. In reality, mat could be very big, something like 10000x15x10000, so execution speed matters a lot.

Here's my toy example:

% Create test matrix
mat = randi(100,20,5,2);
mat(1:5,1,1) = 0;
mat(1:7,2,1) = 0;
mat(1:3,4,1) = 0;
mat(1:10,5,1) = 0;
mat(1:2,1,2) = 0;
mat(1:3,3,2) = 0;
mat(1:7,4,2) = 0;
mat(1:4,5,2) = 0;

% Find first non-zero element in every column
[~, firstNonZero] = max( mat ~= 0 );

% Replace leading zeros with NaN
% How to vectorize this part???
[nRows, nCols, nPlanes] = size(mat);
for j = 1 : nPlanes

   for i = 1 : nCols

       mat(1:firstNonZero(1, i, j)-1, i, j) = NaN;

   end

end
Andi

1 Answer

You could use cumsum to create a cumulative sum down each column. All leading zeros then have a cumulative sum of zero, whilst all intermediate zeros have a cumulative sum greater than zero:

mat( cumsum(mat,1) == 0 ) = NaN;
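To see why the mask picks out only the leading zeros, here is a minimal sketch on a small 2D matrix. The names A, runningSum, leadingMask and B are made up for illustration and the values are non-negative on purpose:

```matlab
% 4x3 example: leading zeros in columns 1 and 2, plus interior zeros
% at A(4,1) and A(3,2) that must be left alone
A = [0 0 5;
     0 3 1;
     2 0 4;
     0 7 6];

% Running total down each column; it stays at 0 only across the leading zeros
runningSum = cumsum(A, 1);

% Logical mask that is true exactly where the running total is still 0
leadingMask = (runningSum == 0);

% Replace only the masked entries
B = A;
B(leadingMask) = NaN;
disp(B)
```

The interior zeros keep their value because the running total is already positive by the time they are reached.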

As suggested in the comments, if your mat has negative values then there's a chance the cumulative sum will return to 0 later on. In that case, use the sum of absolute values instead:

mat( cumsum(abs(mat),1) == 0 ) = NaN;
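A small sketch of that failure mode, using a made-up column vector v in which 3 and -3 cancel out. The variable names are hypothetical:

```matlab
% Column with one leading zero, a cancelling pair (3, -3) and an interior zero
v = [0; 3; -3; 0; 5];

% Plain cumsum returns to 0 at rows 3 and 4, so the mask wrongly includes them
plainMask = cumsum(v, 1) == 0;

% The sum of absolute values never decreases, so only row 1 is flagged
absMask = cumsum(abs(v), 1) == 0;

disp([plainMask, absMask])
```

Note that with the plain mask even the non-zero entry -3 would be overwritten, not just the interior zero.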

Note that by default cumsum operates along the first non-singleton dimension; you can use the optional dim argument to specify the dimension. I've used dim=1 to enforce column-wise operation in case your mat could be of height 1, but this is already the default for any matrix with height greater than 1.

Note that this uses == for comparison. If your data is floating-point rather than integer-valued, you may want to read Why is 24.0000 not equal to 24.0000 in MATLAB? and use a threshold for your equality comparison.
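One possible way to apply such a threshold, as a sketch only: count the entries whose magnitude exceeds a tolerance, and treat everything before the first such entry as a leading zero. The tolerance tol and its value are assumptions and should be chosen to suit the scale of your data:

```matlab
% Hypothetical tolerance below which a value is treated as zero
tol = 1e-12;

% Column with an exact zero, a tiny rounding residue, and an interior zero
v = [0; 1e-15; 2; 0; 4];

% Count significant entries seen so far; a count of 0 means "still leading"
isLeading = cumsum(abs(v) > tol, 1) == 0;

v(isLeading) = NaN;
disp(v)
```

Counting significant entries, rather than comparing the running sum itself against tol, avoids many tiny values adding up to something above the threshold.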

Wolfie
  • Clever solution! The only issue I see with this is if `mat` contains negative values too. There will be a chance (very small, but still) that a later 0 has a `cumsum` of 0 as well. I suppose one could do `cumsum(abs(mat))`, but we might be able to think of a better way. – Floris Sep 05 '18 at 08:25
  • @Wolfie That's a very neat one-liner. I am impressed! – Andi Sep 05 '18 at 09:00
  • @Andi, depending how large the sum gets / how many leading zeros you have, I assume performance may vary with any method. In this case, `cumsum` is potentially doing a lot of adding up which it doesn't need to. Hopefully still fairly performant though! – Wolfie Sep 05 '18 at 09:03
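On the performance point raised in the last comment, one alternative that avoids the running sum is to reuse firstNonZero from the question and compare it against the row indices. This is a sketch only, not something benchmarked in this thread. It assumes MATLAB R2016b or later for implicit expansion, and the names leadingMask and result are made up here:

```matlab
% Toy data in the same shape as the question (randi(100, ...) never yields 0)
mat = randi(100, 20, 5, 2);
mat(1:5, 1, 1) = 0;
mat(1:7, 4, 2) = 0;
mat(12, 1, 1) = 0;   % interior zero that must survive

nRows = size(mat, 1);

% Index of the first non-zero entry in each column (1 x nCols x nPlanes)
[~, firstNonZero] = max(mat ~= 0, [], 1);

% Compare a column of row indices against firstNonZero; implicit expansion
% yields an nRows x nCols x nPlanes mask of the leading positions
leadingMask = (1:nRows).' < firstNonZero;

result = mat;
result(leadingMask) = NaN;
```

As with the original loops, a column consisting entirely of zeros is left unchanged, because max then returns index 1.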