Find locations of "TRUE" clusters in a vector of logicals

Question

I have the following vector of logicals:

vect1 = [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1]

I would like to locate every cluster of 1 values in this vector, along with each cluster's starting and ending index. For output, I would like something like:

 5  8
13 15
18 19
23 23

where the first number is the "starting" index of each cluster and the second number is the "ending" index of each cluster.

EDIT: I was able to get this to work with a modified version of Shai's answer:

pv = [vect1 0];
sv = [0 pv(1:(end-1))];
ev = [pv(2:end) 0];
starting = find( pv - sv == 1 )
ending = find( pv - ev == 1 )

Anything wrong with my working answer? Also using `diff` is going to be much more efficient if your vectors are large. — horchler, Aug 11 '13 at 20:54
Interesting, I haven't tested larger vectors (which are in fact what I will be using this on). I will have to test those and will let you know. Thanks again! — Mr.Kinn, Aug 11 '13 at 20:56
I created a vector with 100,000 elements (`vect1 = randi([0 1], 1, 100000)`), and then ran both solutions in a for loop with 1,000 iterations. Your updated answer took 2.3 seconds to execute; Shai's took 1.6 (on repeated tries). 100,000 is about the size of my vectors. Am I still missing something? Thank you once again — Mr.Kinn, Aug 11 '13 at 21:10
Same test (with code that outputs a matrix as in your question): 2.1415 and 2.1419. These things are version- and OS- and hardware-dependent. You also must time properly or you're measuring the wrong thing -variables must be cleared between calling each version and you should "warm up" by calling the code before timing it. More important: use the code that works for you and that you understand. — horchler, Aug 11 '13 at 21:16
Could you test the speed of my version posted below too? It avoids some of the copying done in the other solutions and only uses a single find. — Bas Swinckels, Aug 11 '13 at 22:40
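
The timing advice in the comments above (warm up first, and avoid measuring setup work) can be sketched with `timeit`, which is available in MATLAB R2013b and later and runs the function several times before reporting a median. The handle names `shiftVersion` and `diffVersion` are illustrative and do not come from the thread, and the measured times will differ by version and hardware.

```matlab
% Sketch only: handle names are illustrative, not from the thread.
vect1 = randi([0 1], 1, 100000);

% Shift-and-subtract approach (as in the question's EDIT), returned as N-by-2
shiftVersion = @(v) [find([v 0] - [0 v] == 1); ...
                     find([v 0] - [v(2:end) 0 0] == 1)]';

% diff-based approach on the zero-padded vector, returned as N-by-2
diffVersion = @(v) [find(diff([0 v 0]) == 1); ...
                    find(diff([0 v 0]) == -1) - 1]';

% timeit handles warm-up and repetition internally
tShift = timeit(@() shiftVersion(vect1));
tDiff  = timeit(@() diffVersion(vect1));
fprintf('shift: %.6f s, diff: %.6f s\n', tShift, tDiff);
```

Both handles should return the same matrix, so it is worth checking that with `isequal` before comparing the times.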

score 4 · Answer 1 · edited May 23 '17 at 10:25

This question is nearly a duplicate of this one. Adapting my answer from there:

vect1 = [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1];
v1 = (vect1(:)==1);
d = diff(v1);
output = [find([v1(1);d]==1) find([d;-v1(end)]==-1)]

which returns

output =
     5     8
    13    15
    18    19
    23    23

The two calls to find can be reduced to one with

[output,~] = find([[v1(1);d] [-d;v1(end)]]==1);
output = reshape(output,[length(output)/2 2]);
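
As a quick check of the boundary handling in the two-`find` version above, here is a small sketch on a vector that both starts and ends with a 1. The test vector `v` is made up for illustration; prepending `v1(1)` and appending `-v1(end)` is what lets runs that touch either end of the vector be detected.

```matlab
% Illustrative input (not from the thread): runs touch both ends of the vector
v  = [1 1 0 1];
v1 = (v(:) == 1);          % logical column vector
d  = diff(v1);             % +1 on a rising edge, -1 on a falling edge

% Prepending v1(1) flags a run starting at index 1;
% appending -v1(end) flags a run ending at the last index.
output = [find([v1(1); d] == 1), find([d; -v1(end)] == -1)];
disp(output)               % rows are [start end]: 1 2 and 4 4
```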

edited May 23 '17 at 10:25 · Community
answered Aug 11 '13 at 20:37 · horchler

Thank you for your answer! This also works, but Shai's solution is significantly faster (I need to run this code many times). – Mr.Kinn Aug 11 '13 at 20:54
Significantly?? I just tested and actually couldn't tell the difference. In any case, `diff` is supposed to be faster (multi-core optimized, etc.), but maybe for simple cases the JIT in recent versions can do a good job. It must be smart enough to figure out that two vector-vector subtractions are identical. – horchler Aug 11 '13 at 21:06

score 3 · Accepted Answer · edited Aug 12 '13 at 01:25

To handle the last 1, it would be simpler to pad the vector with a zero:

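pv = [vect1 0];                        % pad so a trailing run of 1s still gets a falling edge
sv = [0 pv(1:end-1)];                  % pv shifted right by one; same length as pv
starting = find( pv - sv == 1 );       % rising edges: first index of each cluster
ending = find( pv - sv == -1 ) - 1;    % falling edges sit one past each cluster's last index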

edited Aug 12 '13 at 01:25 · grantnz
answered Aug 11 '13 at 20:26 · Shai

The code gave me a `Matrix dimensions must agree.` error, so I made some modifications and posted them above. Please let me know if you agree with those. – Mr.Kinn Aug 11 '13 at 20:53

score 1 · Answer 3 · answered Aug 12 '13 at 01:45

This is the simplest one-liner I could think of:

out = [find(diff([0 vect1 0])==1); find(diff([0 vect1 0])==-1)-1]'
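
The one-liner evaluates `diff` twice on the same padded vector. A variant that stores the result once might look like the sketch below; the variable name `edges` is ours, and the all-zero check is an added illustration rather than something from the answer.

```matlab
vect1 = [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1];

% Pad with zeros on both sides so runs touching either end produce an edge
edges = diff([0 vect1 0]);

% +1 marks the first index of a run; -1 marks one past the last index
out = [find(edges == 1); find(edges == -1) - 1]';
disp(out)                  % rows are [start end]: 5 8, 13 15, 18 19, 23 23

% A vector with no 1s yields an empty 0-by-2 result
edges0 = diff([0 zeros(1, 5) 0]);
out0 = [find(edges0 == 1); find(edges0 == -1) - 1]';
```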

answered Aug 12 '13 at 01:45 · twerdster

score 1 · Answer 4 · answered Aug 12 '13 at 01:57

There is a run-length encoding function on the MATLAB File Exchange that I use for this sort of problem. The benefit of this solution (i.e., the `rle` function) is that it finds repeated blocks without prior knowledge of which values will be repeated.

encoded = rle(vect1);
summed = cumsum(encoded{2});
isOne = encoded{1}==1;
[summed(isOne)-encoded{2}(isOne)+1;  summed(isOne)]'

See: http://www.mathworks.com/matlabcentral/fileexchange/4955-rle-deencoding
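
For readers without the File Exchange download, the snippet above can be tried with a stand-in. This is a sketch under an assumption: the cell layout `{values, runLengths}` is inferred only from how the answer indexes `encoded`, and the real `rle` function may differ. The variable name `runEnds` is ours.

```matlab
vect1 = [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1];

% Hypothetical stand-in for rle(vect1): ASSUMES a {values, runLengths} cell
runEnds = [find(diff(vect1) ~= 0) numel(vect1)];   % last index of every run
encoded = {vect1(runEnds), diff([0 runEnds])};     % run values and run lengths

% Same steps as in the answer above
summed = cumsum(encoded{2});                       % end index of every run
isOne  = encoded{1} == 1;                          % keep only the runs of 1s
result = [summed(isOne) - encoded{2}(isOne) + 1; summed(isOne)]';
disp(result)               % rows are [start end]: 5 8, 13 15, 18 19, 23 23
```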

Alternatively (and slightly faster):

blockEnds = [ find(vect1(1:end-1) ~= vect1(2:end)) length(vect1) ];
blockStarts = [ 1 blockEnds(1:end-1)+1];
isOne = vect1(blockEnds)==1;
[blockStarts(isOne); blockEnds(isOne)]'
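
Because `blockStarts` and `blockEnds` cover every block, not only the 1s, the same idea extends to vectors holding more than two values. A small sketch with a made-up input `v`, listing every block as `[value start end]`:

```matlab
% Illustrative input (not from the thread) with several distinct values
v = [3 3 7 7 7 3 5];

blockEnds   = [find(v(1:end-1) ~= v(2:end)) length(v)];  % last index of each block
blockStarts = [1 blockEnds(1:end-1) + 1];                % first index of each block

% One row per block: [value start end]
runs = [v(blockEnds); blockStarts; blockEnds]';
disp(runs)                 % rows: 3 1 2, 7 3 5, 3 6 6, 5 7 7
```

Filtering the rows afterwards (for example `runs(runs(:,1) == 7, 2:3)`) recovers the start and end indices for any one value.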
