
Data looks like this:

ID Lat Long Time
1  3   3    00:01
1  3   4    00:02
1  4   4    00:03
2  4   3    00:01
2  4   4    00:02
2  4   5    00:03
3  5   2    00:01
3  5   3    00:02
3  5   4    00:03
4  9   9    00:01
4  9   8    00:02
4  8   8    00:03
5  7   8    00:01
5  8   8    00:02
5  8   9    00:03

I want to measure how far the IDs are from each other, within a given radius, at each time interval. I am doing this for 1,057 IDs across 16,213 time intervals, so efficiency is important.

It is important to restrict the measurement to points within a radius, because if two points are too far apart I don't care about them. I only want distances between points that are relatively close. For example, I don't care how far ID 1 is from ID 5, but I do care how far ID 4 is from ID 5.

I am using R and the sp package.
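For concreteness (this sketch is not part of the original question), one time slice of that computation could look like the following with sp. The radius_km cutoff is a made-up placeholder; spDists with longlat = TRUE returns great-circle distances in kilometres.

library(sp)

radius_km <- 100  # hypothetical cutoff

# One time slice: all pairwise great-circle distances,
# then keep only the pairs that fall inside the radius
slice <- df[df$Time == "00:01", ]
d <- spDists(cbind(slice$Long, slice$Lat), longlat = TRUE)
close <- which(d <= radius_km & upper.tri(d), arr.ind = TRUE)
data.frame(ID1 = slice$ID[close[, 1]],
           ID2 = slice$ID[close[, 2]],
           dist_km = d[close])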

bstrain

  • It seems like you have a clear idea of what you want to do and the steps involved. Try out the `dplyr` and `multidplyr` packages. They have a nice structure for processing tabular data, and `multidplyr` has parallel-processing support. – Rohit Jun 15 '18 at 10:55
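As a reference for that suggestion, here is a minimal skeleton of the grouped-and-partitioned structure using the current multidplyr API. The worker count is arbitrary, and the per-group summary is only a placeholder for the real per-interval distance computation.

library(dplyr)
library(multidplyr)

cluster <- new_cluster(4)            # arbitrary worker count
cluster_library(cluster, "dplyr")

# Time groups are spread across the workers; replace the placeholder
# summary with the actual per-interval computation
out <- df %>%
  group_by(Time) %>%
  partition(cluster) %>%
  summarise(n_points = n()) %>%
  collect()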

1 Answer


From what I can see, there will be many repeated values. Therefore, as a starting point, I would suggest calculating the distance for each pair of coordinates only once (even if the pair is repeated many times in the data frame). Then you can filter the data and merge the tables. (I would have added this as a comment, but I don't have enough reputation to do so yet.)

The first lines would be:

library(dplyr)
library(geosphere)

# Creating a data frame with no repeated coordinates
df2 <- df %>% group_by(Lat, Long) %>% summarise()

# Calculating pairwise distances between the unique coordinates
Dist <- distm(cbind(df2$Long, df2$Lat))
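The answer stops before the "filter and merge" step it mentions, so the following is a hedged sketch of how it could continue, not part of the original answer. The radius_m value and the coord_id column are made up for illustration; distm's default Haversine distance is in metres.

radius_m <- 50000  # hypothetical radius in metres

# Keep each unordered pair of unique coordinates once, within the radius
close <- which(Dist <= radius_m & upper.tri(Dist), arr.ind = TRUE)
pairs <- data.frame(coord1 = close[, 1],
                    coord2 = close[, 2],
                    dist_m = Dist[close])

# Merge back: tag each original row with the index of its unique coordinate
df2$coord_id <- seq_len(nrow(df2))
df_tagged <- merge(df, df2, by = c("Lat", "Long"))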
André Costa
  • Thank you for the answer, but that would mean calculating 1057^2 distances 16,213 times. Yes, this would work if my computer had more RAM, but I am looking for a more elegant solution. – bstrain Jun 15 '18 at 17:44
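One way to avoid both the full 1057 x 1057 matrix and most of the repeated work, sketched here as an editorial assumption rather than taken from the thread, is grid binning: assign each point to a coarse cell and compare only points in the same or adjacent cells at the same time. The radius_m and cell_deg values are placeholders; cell_deg must be wide enough that any two points within the radius land in adjacent cells at the latitudes in your data.

library(dplyr)
library(tidyr)
library(geosphere)

radius_m <- 50000   # hypothetical search radius in metres
cell_deg <- 0.5     # hypothetical cell width in degrees; must span >= radius_m

# Assign every point to a grid cell
df_g <- df %>%
  mutate(gx = floor(Long / cell_deg),
         gy = floor(Lat / cell_deg))

# Replicate each point into its own cell and the 8 surrounding cells, so a
# plain equi-join on (Time, gx, gy) finds every candidate within the radius
neigh <- expand_grid(df_g, dx = -1:1, dy = -1:1) %>%
  mutate(gx = gx + dx, gy = gy + dy) %>%
  select(Time, gx, gy, ID2 = ID, Lat2 = Lat, Long2 = Long)

pairs <- df_g %>%
  inner_join(neigh, by = c("Time", "gx", "gy")) %>%
  filter(ID < ID2) %>%   # keep each unordered pair once, drop self-pairs
  mutate(dist_m = distHaversine(cbind(Long, Lat), cbind(Long2, Lat2))) %>%
  filter(dist_m <= radius_m)

This only ever computes exact distances for candidate pairs in neighbouring cells, so memory stays proportional to the number of nearby pairs rather than to 1057^2 per interval.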