2

I've been stuck with subsequent matching of time-series in MATLAB (I'm new to it).

I have two time-series: A (of length a) and B (of length b). Assume that a is much larger than b. The task is to find the closest window from A to B (according to Euclidian metric).

In order to do that I construct additional matrix C that stores all subsequences of the length b from A and then use pdist2(C, B). Obviously it works slowly and requires too much memory.

So I have a couple of questions:

  1. How to obtain C without loops (actually to reshape A)?

  2. What are the common ways to solve this problem? (preferably in MATLAB but other environments are also possible)

Thanks for your help!

Max
  • 441
  • 2
  • 7
  • 14

3 Answers3

1

For part 2 of your question, a typical way of comparing sequences is through Dynamic Time Warping (DTW). You should almost certainly be able to Google for a Matlab implementation.

The basic version of the DTW algorithm has complexity O(nm), but approximate versions that typically have comparable performance have complexity closer to O(max(n, m)).

user1149913
  • 4,463
  • 1
  • 23
  • 28
1

For the first question you could try

tmp = repmat(A,1,b);
C = reshape([tmp zeros(1,b)],a,b);
C = C(1:(a-b+1),:);

Besides, pdist2 is very slow in comparison to this very nice solution: Efficiently compute pairwise squared Euclidean distance in Matlab

Community
  • 1
  • 1
jolo
  • 423
  • 3
  • 10
  • 1
    Thanks! I've got significant acceleration reshaping time-series and I will try to avoid pdist2. P.S. It must be tmp instead of B in the second row – Max Feb 06 '15 at 20:07
  • This is a nice approach to create all shifted versions of a signal, but it took me a while to get the idea with the added zeros - maybe you could add some explanation? – mbschenkel Feb 07 '15 at 09:32
  • Zeros are needed to get the proper dimensions of a matrix after reshaping. As I use pdist2, I need all the signals to be the same length. Actually they are cut anyway. – Max Feb 08 '15 at 08:03
0

I would like to suggest cross-correlation (xcorr) as an approach to this problem. On how cross-correlation and euclidian distance are related, refer for instance to the introduction of this article. It is not invariant to scaling in time or amplitude and may be sensitive to noise, but the question does not imply any such distortions.

An advantage of cross-correlation is its efficient implementation in the transform domain. Unfortunately I have only an old Matlab version without pdist2 at hands, so I can not time it. But consider

%// Parameters
a = 1e4;
b = 1e2;
noise = 0.1;

%// Create sample signals with some distortion
A = rand(1, a);
Offset_actual = 321
B = A(Offset_actual + [1:b]) + noise*rand(1, b);

%// Computation
CC = xcorr(A, B);
[m, i] = max(CC);
Offset_estimated = i - a
plot(CC)

which should recover Offset_estimated == Offset_actual.

mbschenkel
  • 1,865
  • 1
  • 18
  • 40