I recently ran into a problem when calculating spherical distances for a large set of locations in MATLAB. Here is the problem:
1. The locations are latitudes and longitudes, stored in an n-by-2 matrix locs, where n = 10^5.
2. I want to calculate the spherical distance between every pair of locations in locs. The resulting distance matrix may or may not be sparse; I cannot tell without computing the distances. Let's assume it is sparse, because I can truncate the distances at some threshold, say 1000 km, and the truncated distance matrix is enough for my purposes.
3. I tried two methods to calculate it, but each of them has a drawback.
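As an aside on the truncation in point 2: a great-circle distance is never shorter than the Earth's radius times the difference in latitude (in radians), so points whose latitudes differ by more than that bound cannot be within the threshold. Here is a minimal sketch of such a pre-filter. The radius R = 6371 km, the variable names, and the handful of latitudes are assumptions for illustration only, not my real data.

```matlab
R = 6371;                             % assumed mean Earth radius in km
cutoff = 1000;                        % truncation threshold in km
band_deg = cutoff / (R * pi / 180);   % latitude difference (degrees) spanning cutoff km

% Made-up latitudes in degrees, for illustration only.
lat = [-60; -10; -5; 0; 3; 40; 80];
q = 4;                                % query point, lat(q) = 0

% Only these candidates can possibly lie within cutoff km of point q;
% the exact distance still has to be computed for each of them.
cand = find(abs(lat - lat(q)) <= band_deg);
disp(cand')
```

The filter only discards pairs that are certainly too far apart, so the exact distance function is still needed for the remaining candidates.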
PS: I am running the code on Mac OS with MATLAB 2014 (student version). The function great_circle_distance() works much like the MATLAB built-in function for calculating the great-circle distance between two locations given as geographic coordinates. The problem here is not about how to use this function.
Thanks in advance for any suggestions.
--Disc
Method 1:
dist_vec = [];   % distances that pass the filter
dist_id = [];    % row indices (the other location)
dist_id2 = [];   % column indices (the current location i)
for i = 1:n
    % distances from location i to all locations, assumed to be n-by-1
    dist = great_circle_distance(locs(i, :), locs);
    % for example, consider the distances between 0 and 300
    dist_id_temp = find(dist < 300 & dist > 0);
    dist_id = [dist_id; dist_id_temp];
    % find() on a column vector returns 1 as every column index, so use i instead
    dist_id2 = [dist_id2; i * ones(numel(dist_id_temp), 1)];
    dist_vec = [dist_vec; dist(dist_id_temp)];
    if mod(i, 1000) == 0
        disp(i)  % progress indicator
    end
end
dist_mat = sparse(dist_id, dist_id2, dist_vec, n, n);
The drawback here is that it takes a long time to finish, at least 48 hours, although it does not consume much memory (around 2-5 GB).
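A variation on Method 1 that I have not timed on the full data, so treat it as a sketch: collect each iteration's results in cell arrays instead of growing the vectors, and compare point i only with points i+1..n, since the distance is symmetric. The anonymous function gc_dist is a hypothetical stand-in for great_circle_distance() (haversine formula, assumed Earth radius of 6371 km), and the small random locs is made-up test data.

```matlab
% Hypothetical stand-in for great_circle_distance(): haversine formula.
% p is 1-by-2 [lat lon] in degrees, q is m-by-2, the result is m-by-1 in km.
R = 6371;  % assumed mean Earth radius in km
gc_dist = @(p, q) 2 * R * asin(min(1, sqrt( ...
    sind((q(:, 1) - p(1)) / 2).^2 + ...
    cosd(p(1)) .* cosd(q(:, 1)) .* sind((q(:, 2) - p(2)) / 2).^2)));

% Made-up test data: 200 random points, not the real 10^5 locations.
rng(0);
n = 200;
locs = [180 * rand(n, 1) - 90, 360 * rand(n, 1) - 180];
cutoff = 1000;  % km

rows = cell(n, 1); cols = cell(n, 1); vals = cell(n, 1);
for i = 1:n-1
    later = (i+1:n)';                        % only pairs with j > i
    d = gc_dist(locs(i, :), locs(later, :));
    keep = d > 0 & d < cutoff;
    rows{i} = later(keep);
    cols{i} = i * ones(nnz(keep), 1);
    vals{i} = d(keep);
end

% Assemble once; these entries all lie below the main diagonal.
L = sparse(vertcat(rows{:}), vertcat(cols{:}), vertcat(vals{:}), n, n);
dist_mat = L + L';                           % mirror to get the symmetric matrix
```

This halves the number of distance evaluations and avoids repeated reallocation, but it is still quadratic in n.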
Method 2:
d = pdist(locs, @great_circle_distance); % row vector with n*(n-1)/2 pairwise distances
[i, j] = find(tril(true(n), -1)); % row/column indices below the main diagonal, in the same order as d
d = d';
keep = d ~= 0; % drop zero distances
dist_mat = sparse(i(keep), j(keep), d(keep), n, n);
The drawback of this method is that it consumes too much memory (more than 15 GB) when calculating d.
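A rough back-of-the-envelope estimate of the memory Method 2 would need for n = 10^5, assuming 8 bytes per double, 1 byte per logical, and decimal gigabytes. The variable names are only for illustration.

```matlab
n = 1e5;
n_pairs = n * (n - 1) / 2;        % number of entries returned by pdist
gb = 1e9;                         % bytes per (decimal) gigabyte

mem_d    = 8 * n_pairs / gb;      % the distance vector d, stored as double
mem_mask = 1 * n^2 / gb;          % the logical n-by-n matrix true(n)
mem_idx  = 2 * 8 * n_pairs / gb;  % the index vectors i and j from find()

fprintf('d: %.1f GB, mask: %.1f GB, i and j: %.1f GB\n', ...
        mem_d, mem_mask, mem_idx);
```

So d alone needs about 40 GB, before the mask and the index vectors are counted.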