
I looked through the existing questions dealing with coordinates but couldn't find anything that helps with my problem.

I have a dataset that contains ID, Speed, Time, and a list of Latitude and Longitude values (the dataset can be found at this link): https://drive.google.com/file/d/1MJUvM5WEhua7Rt0lufCyugBdGSKaHMGZ/view?usp=sharing

I want to measure the distance between each pair of Latitude/Longitude points. For example, Latitude has: x1, x2, x3, ..., x1000

Longitude has: y1, y2, y3, ..., y1000

I want to measure the distance from (x1,y1) to all the other points, from (x2,y2) to all the other points, and so on.

I'm doing this to find out which points are close to each other and to assign an index to each location based on distance. For example, if (x1, y1) is close to (x4, y4), then (x1, y1) gets the index A and (x4, y4) gets the index B, so that the points are sorted in order of distance.

I tried the gDistance function but got the error message: "package ‘gDistance’ is not available (for R version 3.4.3)"

If I switch to R version 3.3, library(rgeos) won't work. Any suggestions?

Here's what I tried:

#requiring necessary packages:
library(sp)  # vector data
library(rgeos)  # geometry ops

#Read the data and transform them to spatial objects
d <- read.csv("ReadyData.csv")
sp.ReadyData <- d
coordinates(sp.ReadyData) <- ~Longitude + Latitude
d <- gDistance(sp.ReadyData, byid= TRUE)

Update: I created a spatial object and a spatial data frame as follows:

#Create spatial object:
lonlat <- cbind(spatial$Longitude, spatial$Latitude)
#Create a SpatialPoints object:
library(sp) 
pts <- SpatialPoints(lonlat)
crdref <- CRS('+proj=longlat +datum=WGS84')
pts <- SpatialPoints(lonlat, proj4string=crdref)
# make spatial data frame
ptsdf <- SpatialPointsDataFrame(pts, data=spatial)

Now I'm trying to measure the distance for longitude/latitude coordinates. I tried the dist method, but it doesn't seem to work for me, so I tried the pointDistance method:

gdis <- pointDistance(pts, lonlat=TRUE)

It's still not clear to me how this function measures the distance. I need to work out the distances so I can locate the point in the middle and assign a number to each point based on its position relative to that middle point.

Reta
  • Please only load the packages you actually use in your example. You have about 3 lines of code that do things, so you don't need to load 13 packages (and you don't need to include the `install.packages` lines at all). – Gregor Thomas Jan 22 '18 at 19:52
  • I had to install these packages because when I ran the gDistance function, it asked me to install a library that in turn required another related library. – Reta Jan 22 '18 at 19:58
  • I found the gDistance function through some answers on this website. Here is a link discussing this function, used in the same way: https://stackoverflow.com/questions/26308426/how-do-i-find-the-polygon-nearest-to-a-point-in-r and another one from a different source: https://gis.stackexchange.com/questions/148852/gdistance-to-determine-closest-point-to-polygon-does-changing-projection-change I just needed help with this function. – Reta Jan 22 '18 at 20:44

2 Answers


You can use raster::pointDistance or geosphere::distm, among other functions.

Part of your example data (please avoid files in your questions):

d <- read.table(sep=",", text='
"OBU ID","Time Received","Speed","Latitude","Longitude"
"1",20,1479171686325,0,38.929596,-77.2478813
"2",20,1479171686341,0,38.929596,-77.2478813
"3",20,1479171698485,1.5,38.9295887,-77.2478945
"4",20,1479171704373,1,38.9295048,-77.247922
"5",20,1479171710373,0,38.9294865,-77.2479055
"6",20,1479171710373,0,38.9294865,-77.2479055
"7",20,1479171710373,0,38.9294865,-77.2479055
"8",20,1479171716373,2,38.9294773,-77.2478712
"9",20,1479171716374,2,38.9294773,-77.2478712
"10",20,1479171722373,1.32,38.9294773,-77.2477417')

Solution:

library(raster)
m <- pointDistance(d[, c("Longitude", "Latitude")], lonlat=TRUE)
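
For intuition about what a longitude/latitude distance function computes, here is a minimal base-R sketch of the haversine great-circle formula. It is not the code that raster or geosphere run internally (those packages have their own implementations and may use an ellipsoid, so results can differ slightly); the helper name `haversine_m` and the 6371000 m Earth radius are assumptions made for illustration.

```r
# Hypothetical helper: great-circle distance in metres (haversine formula).
# r is an assumed mean Earth radius; real packages may use an ellipsoid instead.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Full pairwise matrix for a few points taken from the sample data above
lon <- c(-77.2478813, -77.2478945, -77.247922, -77.2477417)
lat <- c(38.929596, 38.9295887, 38.9295048, 38.9294773)
n <- length(lon)
m_hav <- outer(seq_len(n), seq_len(n),
               function(i, j) haversine_m(lon[i], lat[i], lon[j], lat[j]))
round(m_hav, 1)  # distances in metres; the diagonal is zero
```

Row i, column j of the matrix is the distance in metres between point i and point j, which is the same layout as the matrix `m` above.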

To get the nearest point to each point, you can do

mm <- as.matrix(as.dist(m))
diag(mm) <- NA
i <- apply(mm, 1, which.min)

The point pairs

p <- cbind(1:nrow(mm), i)    

To get the distances, you can do:

mm[p] 

Or do this:

apply(mm, 1, min, na.rm=TRUE)
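
The steps above can be collected into one table that lists, for each point, its nearest other point and the distance to it. This is a small sketch on a made-up 3 x 3 matrix; `mm_toy` and `nn` are hypothetical names, and with real data `mm` from above would take the place of `mm_toy`.

```r
# Toy symmetric distance matrix (metres) standing in for `mm`; values are made up
mm_toy <- matrix(c(0, 5, 9,
                   5, 0, 3,
                   9, 3, 0), nrow = 3, byrow = TRUE)
diag(mm_toy) <- NA                       # ignore each point's distance to itself
nearest <- apply(mm_toy, 1, which.min)   # index of the closest other point
idx <- cbind(seq_len(nrow(mm_toy)), nearest)
nn <- data.frame(point = seq_len(nrow(mm_toy)),
                 nearest = nearest,
                 distance = mm_toy[idx])  # matrix indexing picks one cell per row
nn
```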

Note that rgeos::gDistance is for planar data, not for longitude/latitude data.
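
As a rough illustration of why planar distances on raw degrees can mislead: a degree of longitude covers less ground than a degree of latitude away from the equator. The sketch below assumes a spherical Earth with a 6371 km radius, and the function name `deg_lon_km` is hypothetical.

```r
# Approximate ground length (km) of one degree of longitude at a given latitude,
# assuming a spherical Earth (radius is an assumption, not a package constant)
deg_lon_km <- function(lat_deg, r_km = 6371) {
  r_km * cos(lat_deg * pi / 180) * pi / 180
}
deg_lat_km <- 6371 * pi / 180   # one degree of latitude under the same assumption

# Ratio at roughly the latitude of the sample data: noticeably below 1,
# so treating degrees as planar x/y units distorts east-west distances
deg_lon_km(38.93) / deg_lat_km
```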

Here is a similar question/answer with some illustration.

Your data set is too large to make a single distance matrix. You can process your data in chunks to deal with that. Here I am showing that with a rather small chunk size of 4 rows. Make this number much bigger to speed up processing.

library(geosphere)
chunk <- 4  # rows per chunk
start <- seq(1, nrow(d), chunk)
end <- c(start[-1] - 1, nrow(d))   # last row of each chunk, no overlap
x <- d[, c("Longitude", "Latitude")]

r <- list()
for (i in seq_along(start)) {
    rows <- start[i]:end[i]
    y <- x[rows, , drop=FALSE]
    m <- distm(y, x)                        # chunk rows against all points
    m[cbind(seq_along(rows), rows)] <- NA   # drop each point's distance to itself
    r[[i]] <- apply(m, 1, which.min)
}
r <- unlist(r)
r
# [1] 2 1 1 5 6 5 5 9 8 8

So for your data:

d <- read.csv("ReadyData.csv")
chunk <- 100  # rows
# etc

This will take a long time.
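
To judge what chunk size is affordable, it can help to estimate matrix sizes first. A numeric matrix in R stores 8 bytes per cell, so a full n x n distance matrix grows quadratically while a chunk-by-n matrix grows linearly. In this sketch the function name `matrix_gib` and the row count `n` are hypothetical, not taken from the actual data set.

```r
# Approximate size in GiB of a numeric (double) matrix: 8 bytes per cell
matrix_gib <- function(n_rows, n_cols) n_rows * n_cols * 8 / 2^30

n <- 250000            # hypothetical number of points, not the real data size
matrix_gib(n, n)       # full n x n distance matrix
matrix_gib(100, n)     # one chunk of 100 rows against all points
```

A larger chunk means fewer loop iterations, so one option is to pick the largest chunk whose matrix comfortably fits in memory.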

An alternative approach:

library(spdep)
x <- as.matrix(d[, c("Longitude", "Latitude")])
k <- as.vector(knearneigh(x, k=1, longlat=TRUE)$nn)
Robert Hijmans
  • I'm trying to use your example to see the results, but for the third time, whenever I run this command: m <- pointDistance(d[, c("Longitude", "Latitude")], lonlat=TRUE) it shows me "R Session Aborted". I'm sure R is updated and the dataset is not that big! Have you faced the same problem? I can't run this code for some reason. – Reta Jan 23 '18 at 16:15
  • That is very odd. Are you using Rstudio? Can you try in plain R? Can you make sure that you are not loading an old session when you open R? You can try using `geosphere::distm` instead. – Robert Hijmans Jan 23 '18 at 17:46
  • I tried plain R and the same problem happened ("R session Aborted"). However, this time it also shows a message saying that my system has run out of application memory! I don't understand why, since I have around 88g available. How much memory do you have, given that the code worked for you? And for geosphere, do I use it instead of pointDistance? – Reta Jan 23 '18 at 18:42
  • Yes, `distm` replaces `pointDistance`. The example should hardly use any memory. Very odd. – Robert Hijmans Jan 23 '18 at 19:37
  • Thanks Robert! will try and see how it goes.. really appreciate your help – Reta Jan 23 '18 at 20:39
  • I found the problem. It's a memory limitation: I get this message when I try using R on Windows: "Error: cannot allocate vector of size 473.5 Gb". How much memory do you have, please? – Reta Jan 23 '18 at 23:41
  • It makes no sense at all that this would happen when you try the example. It is very small. It could happen with your actual data, if it is a large data set. In that case, you would have to process the data in chunks. – Robert Hijmans Jan 24 '18 at 04:57
  • I was talking about the whole dataset, not the example. Sorry for the misunderstanding! I will try sampling my dataset and see how it works. Thanks! – Reta Jan 24 '18 at 19:58
  • I have updated my answer to show how to deal with a large dataset – Robert Hijmans Jan 24 '18 at 20:41
  • But I'll still have the memory limitation problem! I just created a sample of my whole dataset and want to measure the distance. I found some tutorials that use the dist function, as in dist(data). Or should I use pointDistance? I'm already done creating the SpatialPointsDataFrame object and want to start the analysis and measure the distance. – Reta Jan 24 '18 at 21:15
  • You cannot use `dist`. Please edit your question to show what you are doing. – Robert Hijmans Jan 24 '18 at 22:58
  • Just did! I'm reading about measuring distance in longitude/latitude coordinates but can't find a clear explanation of this method. – Reta Jan 24 '18 at 23:52
  • Right, but you are not following my example, let alone my example for a big dataset – Robert Hijmans Jan 25 '18 at 00:14
  • I have now applied your example to a dataset that has only 5000 observations, but in your example you don't convert the data to a spatial data frame! I read that we first need to make spatial points, then make a spatial data frame, and then measure the distance. Correct me if I got your example wrong. Also, after running your example on my dataset, which value shows me the distance? I apologize, but I'm learning and working on a project at the same time. I appreciate your help. – Reta Jan 25 '18 at 00:40

Assuming you have p1 as spatialpoints of x and p2 as spatialpoints of y, to get the index of the nearest other point:

ReadyData$cloDist <- apply(gDistance(p1, p2, byid=TRUE), 1, which.min)

If the same coordinates appear in both lists, you will get the index of the point itself, since the closest place to a point is the point itself. An easy trick to avoid that is to use the second smallest distance as the reference, with a quick helper function:

f_which.min <- function(vec, idx) sort(vec, index.return = TRUE)$ix[idx]
ReadyData$cloDist2 <- apply(gDistance(p1, p2, byid=TRUE), 1, f_which.min, 
idx = 2)
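
To show what the helper does on its own, here is a small sketch with a made-up distance vector (`d_row` is a hypothetical name, and the values are not from the data set). The first element plays the role of the point's distance to itself.

```r
# Helper from the answer above: position of the idx-th smallest value
f_which.min <- function(vec, idx) sort(vec, index.return = TRUE)$ix[idx]

# Made-up distances from one point to four points; element 1 is the point itself
d_row <- c(0, 7.5, 2.1, 4.0)
which.min(d_row)        # position of the smallest value: the point itself
f_which.min(d_row, 2)   # position of the second smallest: the closest other point
```

If several points share identical coordinates, more than one distance is zero, so the second smallest value may belong to a duplicate rather than to a distinct location.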
Neal Barsch