1

I have the following vector of logicals:

vect1 = [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1]

I would like to locate every "cluster" of consecutive 1 values within this vector, along with its starting and ending indices. For output, I would like something like:

5 8
13 15
18 19
23 23

where the first number is the "starting" index of each cluster and the second number is the "ending" index of each cluster.

EDIT: I was able to get this to work with a modified version of Shai's answer:

pv = [vect1 0];
sv = [0 pv(1:(end-1))];
ev = [pv(2:end) 0];
starting = find( pv - sv == 1 )
ending = find( pv - ev == 1 )
Mr.Kinn
  • Anything wrong with my working answer? Also using `diff` is going to be much more efficient if your vectors are large. – horchler Aug 11 '13 at 20:54
  • Interesting, I haven't tested larger vectors (which are in fact what I will be using this on). I will have to test those and will let you know. Thanks again! – Mr.Kinn Aug 11 '13 at 20:56
  • I created a vector with 100,000 elements (`vect1 = randi([0 1], 1, 100000)`), and then ran both solutions in a for loop with 1,000 iterations. Your updated answer took 2.3 seconds to execute; Shai's took 1.6 (on repeated tries). 100,000 is about the size of my vectors. Am I still missing something? Thank you once again – Mr.Kinn Aug 11 '13 at 21:10
  • Same test (with code that outputs a matrix as in your question): 2.1415 and 2.1419. These things are version-, OS-, and hardware-dependent. You also must time properly or you're measuring the wrong thing: variables must be cleared between calling each version, and you should "warm up" by calling the code before timing it. More important: use the code that works for you and that you understand. – horchler Aug 11 '13 at 21:16
  • Could you test the speed of my version posted below too? It avoids some of the copying done in the other solutions and only uses a single find. – Bas Swinckels Aug 11 '13 at 22:40
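The timing advice in the comments above can be sketched with `timeit`, which runs the function several times and reports a typical time, so warm-up is largely handled for you. This is only an illustrative harness: it assumes a MATLAB release that ships `timeit` (R2013b or later), and the handle names `viaShift` and `viaDiff` are hypothetical, not taken from any answer.

```matlab
% Hypothetical benchmark sketch; viaShift and viaDiff are illustrative names.
vect1 = randi([0 1], 1, 100000);   % same test size as in the comments

% Approach 1: compare the vector with explicitly shifted copies of itself
viaShift = @(v) [find(v - [0 v(1:end-1)] == 1); ...
                 find(v - [v(2:end) 0] == 1)]';

% Approach 2: let diff compute the transitions on a zero-padded vector
viaDiff = @(v) [find(diff([0 v]) == 1); ...
                find(diff([v 0]) == -1)]';

% Check that both give the same clusters before comparing their speed
sameResult = isequal(viaShift(vect1), viaDiff(vect1));

tShift = timeit(@() viaShift(vect1));
tDiff  = timeit(@() viaDiff(vect1));
fprintf('shifted copies: %.4g s, diff: %.4g s\n', tShift, tDiff);
```

As noted in the comments, the numbers depend on MATLAB version, OS, and hardware, so treat any difference as specific to your own machine.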

4 Answers

4

This question is nearly a duplicate of this one. Adapting my answer from there:

vect1 = [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1];
v1 = (vect1(:)==1);
d = diff(v1);
output = [find([v1(1);d]==1) find([d;-v1(end)]==-1)]

which returns

output =

     5     8
    13    15
    18    19
    23    23

The two calls to find can be reduced to one with

[output,~] = find([[v1(1);d] [-d;v1(end)]]==1);
output = reshape(output,[length(output)/2 2]);
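The single-`find` version relies on `find` returning matches in column-major order: all rows from the first column (the starts) come before any rows from the second column (the ends), which is why the `reshape` works. A minimal sketch that makes this explicit by also requesting the column indices; the variable names `M`, `rows`, and `cols` are introduced here for illustration only.

```matlab
vect1 = [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1];
v1 = (vect1(:) == 1);   % logical column vector
d  = diff(v1);          % +1 where a cluster begins, -1 just after one ends

% Column 1 flags cluster starts, column 2 flags cluster ends
M = [[v1(1); d], [-d; v1(end)]] == 1;

% find walks M column by column, so start rows come first, then end rows
[rows, cols] = find(M);

% Split by column instead of relying on the ordering implicitly
output = [rows(cols == 1), rows(cols == 2)];
disp(output)
```

Splitting on `cols` gives the same matrix as the `reshape` above, at the cost of two extra logical-indexing operations.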
Community
horchler
  • Thank you for your answer! This also works, but Shai's solution is significantly faster (I need to run this code many times). – Mr.Kinn Aug 11 '13 at 20:54
  • Significantly?? I just tested and actually couldn't tell the difference. In any case, `diff` is supposed to be faster (multi-core optimized, etc.), but maybe for simple cases the JIT in recent versions can do a good job. It must be smart enough to figure out that two vector-vector subtractions are identical. – horchler Aug 11 '13 at 21:06
3

To handle the last 1, it would be simpler to pad the vector with a zero:

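pv = [vect1 0];          % pad so a cluster that ends at the last element is closed
sv = [0 pv(1:end-1)];    % pv shifted right by one; must match pv in length (plain [0 pv] is one element too long)
starting = find( pv - sv == 1 );      % 0 -> 1 transitions
ending = find( pv - sv == -1 ) - 1;   % 1 -> 0 transitions sit one past each cluster's end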
grantnz
Shai
  • The code gave me a `Matrix dimensions must agree.` error, so I made some modifications and posted them above. Please let me know if you agree with those. – Mr.Kinn Aug 11 '13 at 20:53
1

This is the simplest one-liner I could think of:

out =  [find(diff([0 vect1 0])==1); find(diff([0 vect1 0])==-1)-1]'
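For readers who prefer a few named steps, the same idea can be sketched with `diff` computed only once. The variable names `d`, `starts`, `ends`, and `lens` are introduced here for illustration, and the sketch assumes `vect1` is a row vector.

```matlab
vect1 = [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1];

% Zero padding on both sides makes clusters touching either edge visible
d = diff([0 vect1 0]);

starts = find(d == 1);        % first index of each cluster
ends   = find(d == -1) - 1;   % last index of each cluster
out    = [starts; ends]';     % one row per cluster: [start end]
lens   = ends - starts + 1;   % cluster lengths, a free by-product

disp(out)
```

A vector with no 1 values yields an empty 0-by-2 `out`, so downstream code can loop over its rows without a special case.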
twerdster
1

There is a run-length encoding function on the MATLAB File Exchange that I use for this sort of problem. The benefit of this solution (i.e., the rle function) is that it finds repeated blocks without prior knowledge of which values will be repeated.

encoded = rle(vect1);
summed = cumsum(encoded{2});
isOne = encoded{1}==1;
[summed(isOne)-encoded{2}(isOne)+1;  summed(isOne)]'

See: http://www.mathworks.com/matlabcentral/fileexchange/4955-rle-deencoding

Alternatively (and slightly faster):

blockEnds = [ find(vect1(1:end-1) ~= vect1(2:end)) length(vect1) ];
blockStarts = [ 1 blockEnds(1:end-1)+1];
isOne = vect1(blockEnds)==1;
[blockStarts(isOne); blockEnds(isOne)]'
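The block-boundary approach can also list every run, not just the runs of 1, which gives the "no prior knowledge of the values" benefit without the File Exchange function. A minimal sketch using only built-ins; the `runs` table layout is an illustrative choice, and the code assumes `vect1` is a non-empty row vector.

```matlab
vect1 = [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1];

% A block ends wherever the next element differs, and at the last element
blockEnds   = [find(diff(vect1) ~= 0), numel(vect1)];
blockStarts = [1, blockEnds(1:end-1) + 1];

% One row per run: [value, start, end, length]
runs = [vect1(blockEnds); blockStarts; blockEnds; ...
        blockEnds - blockStarts + 1]';

% Selecting the runs whose value is 1 recovers the requested output
onesOnly = runs(runs(:, 1) == 1, 2:3);
disp(onesOnly)
```

Filtering on the first column of `runs` works for any value that occurs in the vector, so the same table serves inputs that are not strictly 0/1.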
grantnz