4

I have a set of N measurements taken at S time points (the time points are different for different measurements). I have two matrices:

V - an NxS matrix that represents the measurement values

T - an NxS matrix that represents the measurement times

I would like to generate a matrix VI that represents the linearly interpolated measurements at times TI. A non-vectorized version of the code follows:

tic;
VI = zeros([size(V,1), size(TI,2)]);
for j = 1:size(V,1)
    VI(j,:) = interp1(T(j,:),V(j,:),TI);
end
toc;

I would like to rewrite this code to eliminate the for-loop so that it is implemented using matrix operations and functions. Can it be vectorized?

Marc
  • 5,315
  • 5
  • 30
  • 36
  • Please supply some data to improve your solution. The profile will allow us to understand where the bottleneck is. – Andrey Rubshtein Sep 20 '12 at 07:44
  • @Andrey - Sorry. The point of this question wasn't "how can I speed up my code?" The point was - "can this be done using matrix operations/functions" eliminating the for-loop. I'll edit the question to remove reference to the execution time, which is something of a red herring. – Marc Sep 20 '12 at 14:17
  • You see, loops are not evil. You said that performance is not your primary interest. What is the reason that you want to vectorize? Do you want a clearer code for the reader? Maybe to save memory space? – Andrey Rubshtein Sep 20 '12 at 14:29
  • (1) clarity of code; (2) proper use of the matlab language and built-ins; (3) possibility of substantial (order of magnitude) performance improvements if a vectorized built-in exists. – Marc Sep 20 '12 at 14:41
  • (1) - I think that your code is excellent and clear. About (2) - This is very subjective. About (3) - Maybe you can improve performance *without* vectorization, worth giving it a try! – Andrey Rubshtein Sep 20 '12 at 14:46

4 Answers4

2

It is hard to say anything without a data and running a profiler, but if your data is sorted you can use interp1q instead of interp , which does not do any checks on the data.

Taken from Matlab help:

For interp1q to work properly, x must be a monotonically increasing column vector. Y must be a column vector or matrix with length(x) rows. xi must be a column vector

Andrey Rubshtein
  • 20,795
  • 11
  • 69
  • 104
  • 1
    thanks for the suggestion; substituting interp1q shaved 15% off the execution time. – Marc Sep 20 '12 at 15:20
1

Matlab packs data in columns rather than rows, so I suspect you will see an improvement in speed simply from changing around the loop from going over rows to going over columns:

[N, S] = size(V);

VT = V';                             % Value series in columns
TT = T';                             % Time series in columns
VIT = zeros(length(TI), N);          % Interpolated value series in columns

for j = 1:N
    VIT(:, j) = interp1(VT(:, j), TT(:, j), TI);
end

VI = VIT';                           % Interpolated value series in rows

Unfortunately I don't think too much can be done to further improve the performance of this code since the different value series do not have related time series. If they had the same times, so that we could collapse T into a vector with length(T) = S, then it would be easy to do this:

VIT = interp1(VT, T, TI);            % (see VIT and VT from above)
Brian L
  • 3,201
  • 1
  • 15
  • 15
1

If you have, as you say, different time of measurement for all measurements (T is a matrix and not a vector), you can do what you want with one call to arrayfun as follows:

VI = arrayfun(@(x)(interp1(T(x,:),V(x,:),TI)), 1:size(V, 1), 'UniformOutput', false);
VI = cell2mat(VI');

arrayfun is similar to a loop, but since it is an internal matlab function it might be faster. It returns a cell of vectors, so the second line makes sure that you have a matrix as output. You might not need it - it depends on what you later do with the data.

If on the other hand the measurements have been taken at the same times for different values of N (T is a vector of size S, and not a matrix, or in other words all rows of T are equal) you can interpolate in one call to interp1.

VI = interp1(T(1,:), V', TI)

Here you have to transpose V since interp1 interpolates within columns. This is because MATLAB stores matrices column-wise (columns are contiguous in memory). If you pass V as an SxN matrix it potentially allows for more efficient parallelization of interp1, since all CPUs can access the memory in a more efficient manner. Hence, I would suggest you transpose your matrices in your entire code, unless of course you rely on this exact data layout somewhere else for performance reasons.

Edit Because of the column layout of matrices your original code can be improved by transposing the matrices and working column-wise. The following version is around 20% faster on my computer for N=1000, S=10000 and TI of 10000 elements. It will likely grow with system size due to a more efficient memory bandwidth utilization.

tic;
VI = zeros(size(TI,2), size(V,2));
for j = 1:size(V,2)
    VI(:,j) = interp1(T(:,j),V(:,j),TI);
end
toc;
angainor
  • 11,760
  • 2
  • 36
  • 56
  • tried the array function; did not speed up execution compared to for-loop. thanks for the suggestion! – Marc Sep 20 '12 at 14:38
  • @Marc You could state that speed is the goal in your question. See the updatet answer. – angainor Sep 20 '12 at 15:58
0

I'm at work and so don't have time to familiarize myself with interp1 (I've never used it before), so if the following answer is inappropriate, I apologize in advance.

Having said that, since the iterations of your loop do not depend on each other, vectorization should be possible. It seems to me that you can get rid of the explicit loop using mat2cell and cellfun.

A simple example of what I mean is as follows:

NumRow = 4;
NumCol = 3;
V = randn(NumRow, NumCol);
VCell = mat2cell(V, ones(NumRow, 1), NumCol);
A = cellfun(@sum, VCell);

Of course, what I've done is equivalent to sum(V, 2). But I think the method I've used extends to your situation. The mat2cell function converts V to a cell column vector, where each cell contains a row of V. Then the call to cellfun applies the sum function to each cell in VCell, and returns the result in A. You may be able to simply replace @sum with @interp1 and of course, adjust the inputs to cellfun appropriately.

Let me know if you can't get it to work and I'll try and put together something more explicit once I get home. Also, if you do get it to work, but it doesn't speed things up much, I'd be very interested to know that, so please post the result here.

Colin T Bowers
  • 18,106
  • 8
  • 61
  • 89
  • does using cellfun really represent an improvement over a for-loop? Does MATLAB do something clever and parallel with cellfun? – Marc Sep 20 '12 at 14:21
  • @Marc A worthy question sir! I don't know. That was why I said I would be interested to know if using it sped things up :-) So I got curious and did some basic tests. The results are interesting. I'm going to post them as a new SO question. – Colin T Bowers Sep 21 '12 at 00:23
  • @Marc My new SO question on this topic is [here](http://stackoverflow.com/questions/12522888/arrayfun-can-be-significantly-slower-than-an-explicit-loop-in-matlab-why) – Colin T Bowers Sep 21 '12 at 00:56