I have county parcel level shapefiles and I'm aiming to calculate the number of parcels within a mile (about 1610 meters) in total, as well as with the same owner. I've worked through a solution, and below is my sample code, but it is fairly inefficient and memory intensive. I cannot publicly post the data, but here is the problem with some made up code:
library(rgdal)
library(rgeos)
library(geosphere)
nobs<-1000 # number of observations
nowners<-50 # number of different owners
crs<-"+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
long<-runif(nobs,min=-94.70073, max=-94.24141) #roughly adair county in iowa
lat<-runif(nobs,min=41.15712,max=41.50415) #roughly adair county in iowa
coords<-cbind(long,lat)
owner<-sample(1:nowners,nobs, replace=T) # give id's to owners
df<-as.data.frame(owner)
centroids<-SpatialPointsDataFrame(coords,df,proj4string = CRS(crs)) # make spatial dataframe
d<-distm(centroids) # distance from centroids to other centroid
numdif<-matrix(0,length(owner)) #vectors of 0s to be replaced in loop
numtot<-matrix(0,length(owner))
for (i in 1:length(owner)) {
same_id<-df$owner[i]==owner ## identify locations with same owner ID
numdif[i]<-as.numeric(sum(d[i,]<1609.34 & same_id==F)) #different parcel owners
numtot[i]<-as.numeric(sum(d[i,]<1609.34)) #total parcels
}
The resulting "numdif" and "numtot" vectors give me what I want: A vector of number of neighboring parcels with different owners and total, respectively. However, this process is incredibly time consuming and memory intensive for my counties that have a far larger "nobs." Some counties have 50-75,000 observations (so the resulting matrix m has billions of elements and will likely require far more memory than I have). Does anyone have thoughts about a better way of approaching this problem, from both a speed and memory perspective? Help is greatly appreciated.