
I'm trying to vectorize and split up a for loop to make it run faster, but the variable "aa_sig_combined_vect" returns nothing but zeros after cell 5569. Any idea how to fix this? See the code below, which the user krisdestruction helped me with:

Please note I'm using Ubuntu 14.04 with Octave 3.8.1, which is like MATLAB but is missing some commands. Unfortunately, the parfor command isn't fully implemented in this version of Octave.

t=rand(1,556790);
inner_freq=rand(8193,6);

N=100; % use N chunks
nn = int32( linspace(1, length(t)+1, N+1) );
aa_sig_combined_vect=zeros(size(t));
total_time_so_far=0;

D = diag(inner_freq(1:end-1,2));
A = inner_freq(1:end-1,1);
for ii=1:N
    ind = nn(ii):nn(ii+1)-1;
    tic;
    cosPara = 2 * pi * A * t(ind);
    toc;
    cosResult = cos( cosPara );
    sumParaA = D * cosResult;
    toc;
    sumParaB = repmat(inner_freq(1:end-1,3),[1 length(ind)]);
    toc;
    aa_sig_combined_vect(ind) = sum( sumParaA + sumParaB );
    toc;
    total_time_so_far=total_time_so_far+sum(toc)
    return;
end
fprintf('- Complete  test in %4.4fsec or %4.4fmins\n',total_time_so_far,total_time_so_far/60);

The original working loop whose speed I'm trying to improve is below:

clear all,
t=rand(1,556790);
inner_freq=rand(8193,6);

N=100; # use N chunks
nn = int32(linspace(1, length(t)+1, N+1))
aa_sig_combined=zeros(size(t));
total_time_so_far=0;

for ii=1:N
    tic;
    ind = nn(ii):nn(ii+1)-1;
    aa_sig_combined(ind) = sum(diag(inner_freq(1:end-1,2)) * cos(2 .* pi .* inner_freq(1:end-1,1) * t(ind)) .+ repmat(inner_freq(1:end-1,3),[1 length(ind)]));
    toc
    total_time_so_far=total_time_so_far+sum(toc)
end
fprintf('- Complete  test in %4.4fsec or %4.4fmins\n',total_time_so_far,total_time_so_far/60);

RMSERepmat = sqrt(mean((aa_sig_combined-aa_sig_combined_vect).^2)) % root mean square error between the two arrays; lower is better
Rick T

1 Answer

Well, the reason is fairly obvious: a return has been added to the loop, so it exits after the first iteration and every element after the first chunk keeps its initial value of zero. I found the answer where this code originates, and you might have noted this remark in it:

The return is used to break it after the first iteration as it looks like the rest of the iterations are similar.
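As a rough sketch of the fix, here is the chunked loop from the question with the return removed and a single tic/toc around the whole loop. The sizes are deliberately reduced (an assumption of mine so that it runs quickly; the question uses 556790 samples, 8193 rows and N=100), and `B` and `elapsed` are names I introduced for illustration:

```octave
% Reduced sizes for a quick run (assumption; the question uses
% t=rand(1,556790), inner_freq=rand(8193,6) and N=100).
t = rand(1, 1003);
inner_freq = rand(65, 6);

N  = 10;                                        % use N chunks
nn = int32(linspace(1, length(t) + 1, N + 1));  % chunk boundaries
aa_sig_combined_vect = zeros(size(t));

D = diag(inner_freq(1:end-1, 2));
A = inner_freq(1:end-1, 1);
B = inner_freq(1:end-1, 3);   % hypothetical helper name

tic;                          % time the whole loop once
for ii = 1:N
    ind = nn(ii):nn(ii+1)-1;
    cosResult = cos(2 * pi * A * t(ind));
    % no `return` here, so every chunk gets filled in
    aa_sig_combined_vect(ind) = sum(D * cosResult + repmat(B, [1 length(ind)]));
end
elapsed = toc;
fprintf('- Complete  test in %4.4fsec or %4.4fmins\n', elapsed, elapsed / 60);
```

With the return gone, no part of aa_sig_combined_vect should be left at its initial zeros.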

More generally: adding tic/toc inside the loop is actually going to slow things down, and anything that prints to the screen slows your code down. Both MATLAB and Octave have a built-in profiler, which is what you should use to figure out where your bottlenecks are.
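For Octave, profiling could look roughly like this. It is only a sketch: the workload in the middle is a small stand-in I made up, and you would put your own loop there instead.

```octave
profile clear;            % discard any previously collected data
profile on;               % start collecting

% Stand-in workload (hypothetical); replace with the loop under test.
x = rand(200, 300);
y = sum(cos(2 * pi * x));

profile off;              % stop collecting
info = profile('info');   % struct holding the collected data
profshow(info, 10);       % print the 10 most expensive functions
```

The table printed by profshow shows which functions take the most time, without any tic/toc calls or printing inside the loop itself.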

Also, this line should give the same result on every iteration: inner_freq doesn't change, and while ind does, length(ind) should stay the same:

repmat(inner_freq(1:end-1,3),[1 length(ind)]);

So you can move that out as well, and avoid multiple calls to repmat.
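Whether length(ind) really is the same for every chunk depends on how the number of samples divides into N, so it is worth checking before hoisting the call. A small sketch, where `chunk_len` and `t_len` are names I made up:

```octave
t_len = 556790;   % length(t) from the question
N = 100;          % use N chunks
nn = int32(linspace(1, t_len + 1, N + 1));

% Length of each chunk nn(ii):nn(ii+1)-1
chunk_len = diff(double(nn));

% If this shows more than one value, the chunks are not all the same
% width and a single precomputed repmat will not fit every iteration.
unique(chunk_len)
```

If more than one length shows up, the precomputed matrix has to be sized for the longest chunk and trimmed for the shorter ones.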

nkjt
  • Thanks for the help. I did what you said, but when I move repmat(inner_freq(1:end-1,3),[1 length(ind)]); out of the for loop it runs for about 7 iterations and then I get an error: "error: test_vector_speed.m: operator +: nonconformant arguments (op1 is 8192x5567, op2 is 8192x5568) error: evaluating argument list element number 1 error: called from: error: /home/rt/Documents/octave/eq_research/main/transform/test_vector_speed.m at line 46, column 28." which points to a problem in the line aa_sig_combined_vect(ind) = sum( sumParaA + sumParaB ); – Rick T Jun 25 '15 at 10:53
  • That's because your number of samples isn't evenly divisible by the number of chunks, so the chunks are not all the same length. You could add some error checking to catch that and then subindex, e.g. use repmat once to make an 8192x5568 matrix and then use something like `sumParaB(:,1:length(ind))` – nkjt Jun 26 '15 at 14:22
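
A sketch of that suggestion: build sumParaB once at the width of the longest chunk, then take only the columns each iteration needs. The sizes are reduced here so it runs quickly (my assumption; the question uses 556790 samples and 8193 rows), and `maxLen` is a name I introduced:

```octave
% Reduced sizes for a quick run (assumption, not the question's sizes).
t = rand(1, 1003);
inner_freq = rand(65, 6);

N  = 10;
nn = int32(linspace(1, length(t) + 1, N + 1));
aa_sig_combined_vect = zeros(size(t));

D = diag(inner_freq(1:end-1, 2));
A = inner_freq(1:end-1, 1);

% Call repmat once, wide enough for the longest chunk.
maxLen   = max(diff(double(nn)));   % hypothetical helper name
sumParaB = repmat(inner_freq(1:end-1, 3), [1 maxLen]);

for ii = 1:N
    ind = nn(ii):nn(ii+1)-1;
    sumParaA = D * cos(2 * pi * A * t(ind));
    % Trim sumParaB to this chunk's width so both operands conform.
    aa_sig_combined_vect(ind) = sum(sumParaA + sumParaB(:, 1:length(ind)));
end
```

Because every column of sumParaB is identical, taking the first length(ind) columns gives the same values as calling repmat inside the loop.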