5

So I have a monthly returns matrix of size 1000x300. I would like to take the average of every 12 columns for each row of the returns matrix to get annual returns, which would eventually give me a 1000x25 matrix.

How would I go about doing this in Matlab?

Through some quick searching, I believe I can use the reshape function somehow, but I am having trouble figuring out how to implement it in my code's loop.

So far, this is my attempt.

for i = 1:25
Strategy1.MeanReturn(:,i) = mean(Data.Return(:,i+1):Data.Return(:,i*12+1));
end

Fyi, the +1 is there because I am ignoring the first column of the matrix.

But this gives me a single NaN value.

Stewie Griffin
  • 14,889
  • 11
  • 39
  • 70
rahulk92
  • 85
  • 4

4 Answers

6

You can stack the desired submatrices along the first dimension of a 3D array, then do the average along that dimension, and squeeze out the resulting singleton dimension:

x = rand(10,20); % example data. 1000x300 in your case
N = 4; % group size. 12 in your case
y = reshape(x.', N, size(x,2)/N, []);
result = squeeze(mean(y,1)).';
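
To make the individual steps easier to follow, here is a minimal sketch on a tiny hypothetical matrix (2 rows, 6 columns, groups of 3). The names `xs`, `ys` and `m` are made up for this illustration and are not part of the code above.

```matlab
% Tiny illustration (hypothetical data): 2 rows, 6 columns, groups of N = 3
xs = [ 1  2  3  4  5  6;
      10 20 30 40 50 60];
N = 3;

% xs.' is 6x2. Reshaping to N x (6/N) x [] gives a 3x2x2 array in which
% ys(:,g,r) holds the N values of group g taken from row r of xs
ys = reshape(xs.', N, size(xs,2)/N, []);
disp(size(ys))

% Averaging along dimension 1 leaves a 1x2x2 array
m = mean(ys, 1);
disp(size(m))

% squeeze drops the singleton dimension (2x2, groups by rows), and the
% transpose puts the rows of xs back along the first dimension
result = squeeze(m).';
disp(result)            % [2 5; 20 50]
```

Each row of `result` holds the group means of the corresponding row of `xs`, which is the layout asked for in the question.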
Luis Mendo
  • 110,752
  • 13
  • 76
  • 147
  • Thanks for your solution Luis! Would you mind explaining what squeeze does possibly? I'm still fairly new and would appreciate it. Cheers :) – rahulk92 Jun 03 '16 at 08:26
  • @rahulk92 `squeeze` removes singleton dimensions of arrays with more than two dimensions. For example, an array of size `4`x`1`x`5` is converted into a `4`x`5` array – Luis Mendo Jun 03 '16 at 11:28
5

try this:

B = zeros(1000,25);
A = rand(1000,300);
for i = 1:25    
    B(:,i) = mean(A(:,(i-1)*12+1:i*12),2); 
end

I just tested it by building a sum of ones, and it worked.

bushmills
  • 673
  • 5
  • 17
  • Thank you! I made a small change to fit my data sample, but this was perfect! I did try transposing the matrix and the ', 2' in one of my attempts, but I must have messed up the index i in my for-loop. It works perfectly now. Thanks again for the quick response. – rahulk92 Jun 02 '16 at 07:14
  • 1
    @rahulk92, remember to [preallocate memory](http://stackoverflow.com/a/6217182/2338750)! – Stewie Griffin Jun 02 '16 at 08:03
  • Edited, thanks Stewie! Especially when working with large arrays, preallocation has a huge effect. – bushmills Jun 02 '16 at 18:11
5

Loops aren't always slow. In fact, tests performed by MathWorks have shown an average performance improvement of 40% as a result of the new and improved Execution Engine (JIT) in R2015b:

The average performance improvement across all tests was 40%. Tests consisted of code that used a range of MATLAB products. Although not all applications ran faster with the redesign, the majority of these applications ran at least 10% faster in R2015b than in R2015a.

and

The performance benefit of JIT compilation is greatest when MATLAB code is executed additional times and can re-use the compiled code. This happens in common cases such as for-loops or when applications are run additional times in a MATLAB session


A quick benchmark of the three solutions:

%% bushmills answer, saved as bushmills.m
function B = bushmills(A,N)
B = zeros(size(A,1),size(A,2)/N);
for i = 1:size(A,2)/N   
    B(:,i) = mean(A(:,(i-1)*N+1:i*N),2); % use N rather than a hard-coded 12
end
end

A = rand(1000,300); N = 12;

%% Luis Mendo's answer:
lmendo = @(A,N) squeeze(mean(reshape(A.', N, size(A,2)/N, []))).';

%% Divakar's answer:
divakar = @(A,N) reshape(mean(reshape(A,size(A,1),N,[]),2),size(A,1),[]);

b = @() bushmills(A,N);
l = @() lmendo(A,N);
d = @() divakar(A,N);

tb = timeit(b); tl = timeit(l); td = timeit(d); % store the timings for reuse
sprintf('Bushmill: %e\nLuis Mendo: %e\nDivakar: %e', tb, tl, td)
ans =
Bushmill: 1.102774e-03
Luis Mendo: 1.611329e-03
Divakar: 1.888878e-04

sprintf('Relative to fastest approach:\nDivakar: %0.5f\nBushmill: %0.5f\nLuis Mendo: %0.5f', 1, tb/td, tl/td)
ans =
Relative to fastest approach:
Divakar: 1.00000
Bushmill: 5.34464
Luis Mendo: 10.73969

The loop approach (with pre-allocation) is approximately 40% faster than the squeeze(mean(reshape(...))) solution. Divakar's solution beats both by a mile.


It might be different for other sizes of A and other values of N, but I haven't tested those.
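
Timing aside, it may be worth confirming that the three approaches return the same numbers. The following is only a sketch of such a check; the names `B_loop`, `B_squeeze`, `B_reshape` and the two `maxDiff...` variables are illustrative and do not come from the answers.

```matlab
% Sanity check (a sketch): the three approaches should agree up to rounding
A = rand(1000,300);   % example data, same shape as in the question
N = 12;               % group size
M = size(A,1);

% Loop with preallocation
B_loop = zeros(M, size(A,2)/N);
for i = 1:size(A,2)/N
    B_loop(:,i) = mean(A(:,(i-1)*N+1:i*N), 2);
end

% Transpose, reshape, mean along dimension 1, squeeze
B_squeeze = squeeze(mean(reshape(A.', N, size(A,2)/N, []), 1)).';

% Reshape and mean only
B_reshape = reshape(mean(reshape(A, M, N, []), 2), M, []);

% Largest absolute differences; both should be at rounding-error level
maxDiffSqueeze = max(abs(B_loop(:) - B_squeeze(:)));
maxDiffReshape = max(abs(B_loop(:) - B_reshape(:)));
fprintf('loop vs squeeze: %g\nloop vs reshape: %g\n', maxDiffSqueeze, maxDiffReshape);
```

Exact equality is not guaranteed, because the approaches may sum the values in a different order, so the comparison looks at the maximum absolute difference instead.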

Luis Mendo
  • 110,752
  • 13
  • 76
  • 147
Stewie Griffin
  • 14,889
  • 11
  • 39
  • 70
5

Using the philosophy that reshape is virtually zero cost, here's an approach that basically just uses mean:

% A is the input array of shape (1000,300)

N = 12; % Group size
M = size(A,1);
out = reshape(mean(reshape(A,M,N,[]),2),M,[]);

Would be interesting to see how it performs against the new JIT!
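
The reason this works is that MATLAB stores arrays column by column, so the inner reshape only regroups whole columns into pages. A minimal sketch on a hypothetical 2x6 matrix with groups of 3 (the name `pages` is made up for this illustration):

```matlab
% Hypothetical 2x6 example with groups of N = 3 columns
A = [ 1  2  3  4  5  6;
     10 20 30 40 50 60];
N = 3;
M = size(A,1);

% Column-major storage means that reshaping to M x N x [] puts
% columns 1..N on page 1, columns N+1..2N on page 2, and so on
pages = reshape(A, M, N, []);
disp(pages(:,:,1))      % columns 1 to 3 of A
disp(pages(:,:,2))      % columns 4 to 6 of A

% Averaging along dimension 2 gives an M x 1 x (number of groups) array;
% the final reshape flattens it to M x (number of groups)
out = reshape(mean(pages, 2), M, []);
disp(out)               % [2 5; 20 50]
```

No transpose is needed here because the groups already lie along the second dimension after the first reshape.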

Community
  • 1
  • 1
Divakar
  • 218,885
  • 19
  • 262
  • 358
  • @StewieGriffin Lovely! I would have to give it to `reshape`'s zero cost magic! :) Thanks a lot for the benchmarking! – Divakar Jun 02 '16 at 08:53
  • 1
    Aha, thanks for that! Yeah I am still somewhat of a beginner in Matlab and am just learning about reshape, it's amazing! – rahulk92 Jun 03 '16 at 08:25