I have looked through many forum posts and have not found an answer. I have a large list of latitudes and longitudes. I would like to build a grid over them and assign each lat/long pair a cell reference from that grid. Eventually I want to assign values based on the cell reference. For example, lat 39.5645 and long -122.4654 fall into grid cell reference 1, where the total number of murders is 16 and the number of assaults is 21. There is probably a better way to do this, but this is the only way I know of.
#number of segments, this determines size of grid
segments <- 5
#use these to divide up the arrays
Xcounter <- (max(cleantrain$X) - min(cleantrain$X)) / segments
Ycounter <- (quantile(cleantrain$Y, .9999) - min(cleantrain$Y)) / segments
#arrays created from the counter and lat and longs
Xarray <- as.data.frame(seq(from = min(cleantrain$X), to = max(cleantrain$X), by = Xcounter))
Yarray <- as.data.frame(seq(from = min(cleantrain$Y), to = quantile(cleantrain$Y, .9999), by = Ycounter))
#the max for the latitude is 90 but the .9999 percentile is ~39,
# but I still want the grid to include the 90
Yarray[segments + 1, 1] <- max(cleantrain$Y)
#create dummy column so I know what the values shouldn't be when I print the results
cleantrain$Area <- seq_len(nrow(cleantrain))
#loop over the rows of my data (only the first 100 here)
for (k in 1:100) {
  #this loop goes through the longitudes
  for (i in seq_len(segments)) {
    #this loop goes through the latitudes
    for (j in seq_len(segments)) {
      #check whether the row falls inside grid cell (i, j)
      if (cleantrain$Y[k] <  Yarray[j + 1, 1] &&
          cleantrain$X[k] <  Xarray[i + 1, 1] &&
          cleantrain$Y[k] >= Yarray[j, 1] &&
          cleantrain$X[k] >= Xarray[i, 1]) {
        #write the cell reference to the row
        cleantrain$Area[k] <- (i - 1) * segments + j
      }
    }
  }
}
#check the results
cleantrain$Area[1:100]
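For comparison, the same cell assignment can probably be done without loops. The following is only a sketch, not the code from this post: it assumes equally spaced breakpoints from min to max on both axes (the code above caps latitude at the 99.99th percentile instead) and relies on base R's `findInterval()`. The names `pts`, `grid_breaks`, `col_idx` and `row_idx` are made up for illustration, and the ten points are the sample coordinates listed in the EDIT at the end of this post.

```r
# Sample coordinates copied from the first ten rows shown in this post
pts <- data.frame(
  X = c(-122.4259, -122.4259, -122.4244, -122.4270, -122.4387,
        -122.4033, -122.4233, -122.3713, -122.5082, -122.4191),
  Y = c(37.77460, 37.77460, 37.80041, 37.80087, 37.77154,
        37.71343, 37.72514, 37.72756, 37.77660, 37.80780)
)

segments <- 5

# Hypothetical helper: n + 1 equally spaced breakpoints over the range of v
grid_breaks <- function(v, n) seq(min(v), max(v), length.out = n + 1)

xbreaks <- grid_breaks(pts$X, segments)
ybreaks <- grid_breaks(pts$Y, segments)

# findInterval() returns the index of the interval each value falls into;
# all.inside = TRUE keeps every index within 1..segments, so the maximum
# value lands in the last cell instead of a cell of its own
col_idx <- findInterval(pts$X, xbreaks, all.inside = TRUE)
row_idx <- findInterval(pts$Y, ybreaks, all.inside = TRUE)

# Same numbering scheme as the loop: (i - 1) * segments + j
pts$Area <- (col_idx - 1) * segments + row_idx
print(pts)
```

Because `findInterval()` works on whole vectors, this handles every row in one call per axis rather than `segments * segments` comparisons per row.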
If I write only the i value to cleantrain$Area, it always prints 1 instead of values from 1 to 5, whereas writing only the j value gives 1 to 5 as it is supposed to. If I go into the if statement and swap the i and j loop references, the behaviour flips: j is always 1 and i takes values from 1 to 5.
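A side note on loop bounds in R, since they are easy to get wrong: the `:` operator binds more tightly than `-`, so an expression of the form `1:n-1` is evaluated as `(1:n) - 1`. A small sketch (the variable names `n`, `a`, `b` and `d` are only for illustration):

```r
n <- 5

a <- 1:n-1      # parsed as (1:n) - 1, so it starts at 0
b <- 1:(n-1)    # parentheses give 1 up to n - 1
d <- seq_len(n) # 1 up to n, and empty when n is 0

print(a)
print(b)
print(d)
```

For a grid with `n` cells per axis and `n + 1` breakpoints, `seq_len(n)` visits every cell index exactly once.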
Here are my array values
#Yarray
1 37.70788
2 37.73030
3 37.75272
4 37.77514
5 37.79756
6 37.81998
#Xarray
1 -122.5136
2 -122.1109
3 -121.7082
4 -121.3055
5 -120.9027
6 -120.5000
EDIT:
Here are the first 10 lats and longs:
cleantrain$Y[1:10]
[1] 37.77460 37.77460 37.80041 37.80087 37.77154 37.71343 37.72514 37.72756 37.77660 37.80780
cleantrain$X[1:10]
[1] -122.4259 -122.4259 -122.4244 -122.4270 -122.4387 -122.4033 -122.4233 -122.3713 -122.5082 -122.4191
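As a quick check on the numbers above, the following sketch looks up which interval each of the ten sample points falls into, using the breakpoints exactly as printed for Xarray and Yarray. It uses base R's `findInterval()` rather than the loop from this post, and the names `xbreaks`, `ybreaks`, `x`, `y`, `i` and `j` are chosen for illustration.

```r
# Breakpoints as printed in this post
xbreaks <- c(-122.5136, -122.1109, -121.7082, -121.3055, -120.9027, -120.5000)
ybreaks <- c(37.70788, 37.73030, 37.75272, 37.77514, 37.79756, 37.81998)

# First ten longitudes and latitudes as printed in this post
x <- c(-122.4259, -122.4259, -122.4244, -122.4270, -122.4387,
       -122.4033, -122.4233, -122.3713, -122.5082, -122.4191)
y <- c(37.77460, 37.77460, 37.80041, 37.80087, 37.77154,
       37.71343, 37.72514, 37.72756, 37.77660, 37.80780)

# Interval index along each axis (1..5 for six breakpoints)
i <- findInterval(x, xbreaks, all.inside = TRUE)
j <- findInterval(y, ybreaks, all.inside = TRUE)

print(data.frame(x, y, i, j, Area = (i - 1) * 5 + j))
```

For these ten points every longitude lies between the first two X breakpoints (-122.5136 and -122.1109), so `i` is 1 for all of them, while `j` varies. That may be what produces the symptom described: the longitude grid is stretched out to max(cleantrain$X), whereas the latitude grid is capped at the 99.99th percentile.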