I present my simple working Matlab code and will ask questions:
tic
nrand1 = 10000;
nrand2 = 20000;
% Location matrix 1: [longitude, latitude, w1]
lmat1=[rand(nrand1,1)-75 rand(nrand1,1)+39 round(rand(nrand1,1)*1000)+1];
% Location matrix 2: [longitude, latitude, w2]
lmat2=[rand(nrand2,1)-75 rand(nrand2,1)+39 round(rand(nrand2,1)*100)+1];
% The number of rows for each matrix = In fact it's nrand1 X nrand2, obviously
nobs1 = size(lmat1,1);
nobs2 = size(lmat2,1);
% The number of pair-wise distances
% between L1 locations X L2 locations
ndist = nobs1*nobs2;
% Initialization: Distance vector and weight vector
hdist = zeros(ndist,1);
weight = zeros(ndist,1);
% Double for loop -- for calculating the pair-wise distances and weights
k=1;
for i=1:nobs1
for j=1:nobs2
% distances in kilometers.
lonH = sin(0.5*(lmat1(i,1)-lmat2(j,1))*pi/180.0)^2;
latH = sin(0.5*(lmat1(i,2)-lmat2(j,2))*pi/180.0)^2;
hdist(k) = 0.001*6372797.560856*2 ...
*asin(sqrt(latH+(cos(lmat1(i,2)*pi/180.0) ...
*cos(lmat2(j,2)*pi/180.0))*lonH));
weight(k) = lmat1(i,3)*lmat2(j,3);
k=k+1;
end
end
toc
The code calculates 10000 X 20000 distances and weights.
Elapsed time is 67.124844 seconds.
Is there a way to vectorize the double-loop processing, or to perform a parallel computing? If there is no room for performance improvement in Matlab, I may have to write the double loops in C and call it from Matlab. I don't know how to call C from matlab, so I will ask a separate question. Thanks!