8

I've got a vector and I want to calculate the moving average of it (using a window of width 5).

For instance, if the vector in question is [1,2,3,4,5,6,7,8], then

  • the first entry of the resulting vector should be the sum of all entries in [1,2,3,4,5] (i.e. 15);
  • the second entry of the resulting vector should be the sum of all entries in [2,3,4,5,6] (i.e. 20);
  • etc.

In the end, the resulting vector should be [15,20,25,30]. How can I do that?

jub0bs
  • 60,866
  • 25
  • 183
  • 186
user3034281
  • 287
  • 1
  • 3
  • 7

3 Answers3

19

The conv function is right up your alley:

>> x = 1:8;
>> y = conv(x, ones(1,5), 'valid')

y =
    15    20    25    30

Benchmark

Three answers, three different methods... Here is a quick benchmark (different input sizes, fixed window width of 5) using timeit; feel free to poke holes in it (in the comments) if you think it needs to be refined.

conv emerges as the fastest approach; it's about twice as fast as coin's approach (using filter), and about four times as fast as Luis Mendo's approach (using cumsum).

enter image description here

Here is another benchmark (fixed input size of 1e4, different window widths). Here, Luis Mendo's cumsum approach emerges as the clear winner, because its complexity is primarily governed by the length of the input and is insensitive to the width of the window.

enter image description here

Conclusion

To summarize, you should

  • use the conv approach if your window is relatively small,
  • use the cumsum approach if your window is relatively large.

Code (for benchmarks)

function benchmark

    clear all
    w = 5;                 % moving average window width
    u = ones(1, w); 
    n = logspace(2,6,60);  % vector of input sizes for benchmark
    t1 = zeros(size(n));   % preallocation of time vectors before the loop
    t2 = t1;
    th = t1;

    for k = 1 : numel(n)

        x = rand(1, round(n(k)));  % generate random row vector

        % Luis Mendo's approach (cumsum)
        f = @() luisMendo(w, x);
        tf(k) = timeit(f);

        % coin's approach (filter)
        g = @() coin(w, u, x);
        tg(k) = timeit(g);

        % Jubobs's approach (conv)
        h = @() jubobs(u, x);
        th(k) = timeit(h);
    end

    figure
    hold on
    plot(n, tf, 'bo')
    plot(n, tg, 'ro')
    plot(n, th, 'mo')
    hold off
    xlabel('input size')
    ylabel('time (s)')
    legend('cumsum', 'filter', 'conv')

end

function y = luisMendo(w,x)
    cs = cumsum(x);
    y(1,numel(x)-w+1) = 0; %// hackish way to preallocate result
    y(1) = cs(w);
    y(2:end) = cs(w+1:end) - cs(1:end-w);
end

function y = coin(w,u,x)
    y = filter(u, 1, x);
    y = y(w:end);
end

function jubobs(u,x)
    y = conv(x, u, 'valid');
end

function benchmark2

    clear all
    w = round(logspace(1,3,31));    % moving average window width 
    n = 1e4;  % vector of input sizes for benchmark
    t1 = zeros(size(n));   % preallocation of time vectors before the loop
    t2 = t1;
    th = t1;

    for k = 1 : numel(w)
        u = ones(1, w(k));
        x = rand(1, n);  % generate random row vector

        % Luis Mendo's approach (cumsum)
        f = @() luisMendo(w(k), x);
        tf(k) = timeit(f);

        % coin's approach (filter)
        g = @() coin(w(k), u, x);
        tg(k) = timeit(g);

        % Jubobs's approach (conv)
        h = @() jubobs(u, x);
        th(k) = timeit(h);
    end

    figure
    hold on
    plot(w, tf, 'bo')
    plot(w, tg, 'ro')
    plot(w, th, 'mo')
    hold off
    xlabel('window size')
    ylabel('time (s)')
    legend('cumsum', 'filter', 'conv')

end

function y = luisMendo(w,x)
    cs = cumsum(x);
    y(1,numel(x)-w+1) = 0; %// hackish way to preallocate result
    y(1) = cs(w);
    y(2:end) = cs(w+1:end) - cs(1:end-w);
end

function y = coin(w,u,x)
    y = filter(u, 1, x);
    y = y(w:end);
end

function jubobs(u,x)
    y = conv(x, u, 'valid');
end
Community
  • 1
  • 1
jub0bs
  • 60,866
  • 25
  • 183
  • 186
  • 2
    Taking a look with R2016b: the story is pretty much the same. However, R2016a introduced the [`movmean`](https://www.mathworks.com/help/matlab/ref/movmean.html) builtin. For the small window size case its performance is roughly on par with the `filter` approach ([though slightly noisy](http://i.imgur.com/GEeQWMI.png)). For the large window size case its performance is on par with `cumsum`. – sco1 Feb 15 '17 at 14:45
4

Another possibility is to use cumsum. This approach probably requires fewer operations than conv does:

x = 1:8
n = 5;
cs = cumsum(x);
result = cs(n:end) - [0 cs(1:end-n)];

To save a little time, you can replace the last line by

%// clear result
result(1,numel(x)-n+1) = 0; %// hackish way to preallocate result
result(1) = cs(n);
result(2:end) = cs(n+1:end) - cs(1:end-n);
Luis Mendo
  • 110,752
  • 13
  • 76
  • 147
  • @Jubobs Why `u = ones(1, 6)`? Shouldn't it be `u = ones(1, w)`? Your three computations of `y` should give the same size. Also, for reliable timing use `timeit` – Luis Mendo Nov 18 '14 at 12:10
  • @Jubobs If you update your benchmarking (BTW +1 already for the effort), could you use my second version? – Luis Mendo Nov 18 '14 at 12:16
  • 1
    Yes, that `6` is a typo; I'm not sure how it got there. I'll rerun the benchmark later. I don't have access to MATLAB right now, but I'll do that (with `timit`) when I get the chance. – jub0bs Nov 18 '14 at 12:34
  • @Jubobs I see. The advantage of `cumsum` is only apparent for larger values of `w`. For example, for `w=50` it is indeed the fastest method (on my machine). Good benchmarking job! – Luis Mendo Nov 18 '14 at 23:27
  • Interesting. I'll rerun the benchmark, but I'll make the window width vary. Thanks for telling me about `timeit`; I had been using `tic`/`toc` all along. – jub0bs Nov 18 '14 at 23:29
  • @Jubobs So did I until recently; but `timeit` [does a better job](http://stackoverflow.com/questions/18952552/timing-code-in-matlab) at timing – Luis Mendo Nov 18 '14 at 23:33
  • @Jubobs Interesting results! It makes sense that the `cumsum` approach is insensitive to window size. It does approximately `n` sums and `n` subtractions, no matter what the window size is (in fact, that savings in computations is what led me to think it could be an interesting approach) – Luis Mendo Nov 20 '14 at 22:08
  • 1
    Yes, now that you've spelled it out to me, it all makes sense. Great! The overall conclusion is that you're better off using `cumsum` if your window is large, but you should use `conv` if your window is narrow. – jub0bs Nov 21 '14 at 10:51
  • @Jubobs Yes, that sums it up very well – Luis Mendo Nov 21 '14 at 11:36
  • @LuisMendo I used this [here](http://codegolf.stackexchange.com/a/100405/31516)! :) – Stewie Griffin Nov 25 '16 at 23:33
  • @StewieGriffin Hehe, well done! This reminds me that my submission to that challenge is also safe now... – Luis Mendo Nov 25 '16 at 23:38
3

If you want to preserve the size of your input vector, I suggest using filter

>> x = 1:8;
>> y = filter(ones(1,5), 1, x)

y =
     1     3     6    10    15    20    25     30

>> y = (5:end)

y =
     15    20    25    30
jub0bs
  • 60,866
  • 25
  • 183
  • 186
nowox
  • 25,978
  • 39
  • 143
  • 293
  • Note that you're using `filter` incorrectly. The syntax is `filter(b,a,x)`, so you should use `filter(ones(1,5), 1, x)`, instead. You should also discard the first 4 elements of the result afterwards. – jub0bs Nov 18 '14 at 10:36