
In Matlab R2010a:

I am familiar with finding values based on criteria as well as finding the first value in a vector that satisfies criteria. However, how does one find X's and not Y's in the following example? In this case, X's are the first values of a group of values that are findable given my criteria, and there are multiple groups like this amidst thousands of junk values.

I have a vector with 10,000 or more values. Let J be junk values, while X and Y are both values my find criteria will pick up. X's are interesting to me because they are the 'first' values of a series of values that satisfy my criteria before becoming J's. Assume that there are hundreds or thousands more J's in between the X's and Y's, but here is a small example:

[J,J,J,J,J,J,J,J,J,J,J,X,Y,Y,Y,Y,J,J,J,J,J,J,J,J,J,X,Y,Y,Y,Y,J];
  • What are `X` and `Y` really? Just numbers are they all the same as each other? – horchler Aug 06 '13 at 21:40
  • They are all numbers in a time series that are found by a criterion (find all values within timeseries A greater than -whatever & less than -whatever). Due to my sampling rate, there will be many times where, within a split second, a group of values will be found, but I only care about the first number of that group, then want to move on past that group and find the next group, and so on. Within the group that is found, yes, they might be the same number. They also might be only slightly different. Because of that, they will be caught by my find, but I only want the first value of each group – Victoria Westwood Aug 06 '13 at 22:25

2 Answers


Assuming that you're not doing something too strange with those Xs and Ys, this is quite easy. You just need to find the beginning of each cluster:

% Create data using your example (Y can equal X, but we make it different)
J = 1; X = 2; Y = 3;
A = [J,J,J,J,J,J,J,J,J,J,J,X,Y,Y,Y,Y,J,J,J,J,J,J,J,J,J,X,Y,Y,Y,Y,J];

a0 = (A==X);                      % Logical indices of A that match X condition
start = find([a0(1) diff(a0)]==1) % Start index of each group beginning with X
vals = A(start)                   % Should all be equal to X

which returns

start =

     12    26


vals =

      2     2

The J values don't even need to be all the same, just not equal to whatever you're detecting as X. You might also find my answer to this similar question helpful.
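The same idea carries over to any logical criterion, not just equality. Here is a minimal sketch, assuming a made-up signal `sig` and a hypothetical threshold `thresh` (neither comes from the question), where values below the threshold play the role of X and Y:

```matlab
% Hypothetical data, for illustration only
sig    = [0.9 0.8 0.001 0.002 0.003 0.7 0.9 0.002 0.001 0.8];
thresh = 0.004;            % assumed threshold

a0    = (sig < thresh);    % logical criterion: 1 where a value qualifies
d     = diff([0 a0]);      % +1 where a group starts, -1 just after it ends
start = find(d == 1);      % index of the first value of each group
vals  = sig(start);        % the first value of each group
```

Padding with a leading 0 means a group that begins at the very first sample is also reported, which matches the `[a0(1) diff(a0)]` form used above.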

horchler
  • In the example you provide, it works even when X and Y are equal (which is great, as this happens in my timeseries). a0 detects all the values in my timeseries that match the X(or Y) condition. This is fine. However, I encounter a problem: With my timeseries as A, the start variable results in: Empty matrix: 1-by-0. This answer helped a lot but I'm still not there quite yet! – Victoria Westwood Aug 06 '13 at 22:20
  • @user2146356: That means that `find` didn't find any matches. What is your `X` in reality? What are you really plugging in? It's impossible to work in abstraction. Can you give an example of a vector where this doesn't work? – horchler Aug 06 '13 at 22:41
  • With a slightly simplified X, it is any value within a 1x63200 timeseries that is less than .004 (the values range from negative numbers to 1.7 or so). I'd paste my 1x63200 vector because it doesn't work, but that might be too large? – Victoria Westwood Aug 06 '13 at 22:53
  • @user2146356: So you have a fluctuating signal and you want to capture the index of each time the signal drops below a value of 0.004 (and/or rises above 0.004)? Was that your actual question? – horchler Aug 06 '13 at 22:58
  • I think? Is it a different question? – Victoria Westwood Aug 06 '13 at 23:11
  • So if you replace the line `a0 = (A==X);` with `a0 = (timeseries>0.004);` do you capture the points you want? – horchler Aug 06 '13 at 23:12
  • Depending on what your actual signal is, you may want `timeseries<0.004` instead. And you may need to simplify the next line to `start = find(diff(a0)==1)+1` in order to avoid capturing the first point in your time series. Can you see how `diff` is being used to detect changes in the logical criteria vector? – horchler Aug 06 '13 at 23:21
  • I capture all points below .004, but I don't want all those points. Just the first point within a group of points that satisfy the a0 criteria. I'm left with [7455, 7457, 13265, 13267, 26653, 30761, 30763, 40312, 40314, 42294, 48410, 60576, 60578]. So I need to weed out [7457, 13267, 30763, 40314, 60578], as they are not the first values within their groups (see how samples 7455 and 7457 come up as being less than .004, and they are about 6000 samples away from the next values, hence, a group of values). – Victoria Westwood Aug 06 '13 at 23:24
  • Yes, but what is the value of sample 7456? I'm pretty sure that it jumps back above your threshold. Plot these out along with a line at your threshold. You have noisy data. Now we're decidedly in the realm of another question. – horchler Aug 06 '13 at 23:30
  • Yes indeed it does. If it didn't do that the code you provided would probably work. I think I'll have to filter or smooth and then do this method. Thanks for the help. – Victoria Westwood Aug 06 '13 at 23:33
  • Have you tried a double threshold? E.g., something like `a0 = (timeseries<0.004&timeseries>0.002)`? This only captures points in a narrow band. The values of your two thresholds will need to be tuned - they'll depend on how noisy the data is and how big your timesteps are. It depends a lot on how noisy the signal is as it passes through the thresholds. – horchler Aug 06 '13 at 23:41
  • This is what I originally did and works only on an experiment-by-experiment basis. I'm trying to make my code and analysis as simple as possible such that anyone can run it on a dataset without having to look at the values and edit thresholds. – Victoria Westwood Aug 06 '13 at 23:47
  • After implementing a filter and selecting a sensible single threshold point, it works like a charm. Thanks! – Victoria Westwood Aug 08 '13 at 00:22
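As an alternative to filtering (this is not what the asker ended up doing), detections that fall within some minimum number of samples of the previous detection can be dropped after the fact. A rough sketch using the indices quoted in the comments above; `minGap` is a hypothetical value that would need to suit the sampling rate:

```matlab
% Start indices reported in the comments above
start  = [7455 7457 13265 13267 26653 30761 30763 ...
          40312 40314 42294 48410 60576 60578];
minGap = 100;                           % assumed minimum spacing between groups

% Keep the first detection, plus any detection far enough from the one before it
keep       = [true, diff(start) > minGap];
firstStart = start(keep);
% firstStart is [7455 13265 26653 30761 40312 42294 48410 60576]
```

This still requires choosing one parameter, but a gap in samples may be easier to fix across experiments than a pair of amplitude thresholds.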
A = [J,J,J,J,J,J,J,J,J,J,J,X,Y,Y,Y,Y,J,J,J,J,J,J,J,J,J,X,Y,Y,Y,Y,J]; % created vector
I = A(A~=J); % separated out all values that are not junk
V = I(I==I(1)); % separated all values that match the first non-junk value
rafee
  • Two things. `!=` is not valid Matlab. You probably mean `~=` for "not equal". And once that is fixed, this only returns the values. I'm guessing that @user2146356 will want the actual indices that these occur at. – horchler Aug 06 '13 at 22:50
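If indices rather than values are wanted, one possible variation on this answer is to take the indices of all non-junk entries and keep only those that do not directly follow another non-junk entry. A sketch under the same assumption as the answer, namely that every junk entry equals a single value `J`:

```matlab
J = 1; X = 2; Y = 3;   % example values; the result is the same if Y equals X
A = [J,J,J,J,J,J,J,J,J,J,J,X,Y,Y,Y,Y,J,J,J,J,J,J,J,J,J,X,Y,Y,Y,Y,J];

idx      = find(A ~= J);                 % indices of all non-junk entries
firstIdx = idx([true, diff(idx) > 1]);   % keep indices that start a new run
firstVal = A(firstIdx);                  % first value of each run
% firstIdx is [12 26]
```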