0

I have the longitude and latitude of 5449 trees in NYC, as well as a shapefile for 55 different Neighborhood Tabulation Areas (NTAs). Each NTA has a unique NTACode in the shapefile, and I need to append a third column to the long/lat table telling me which NTA (if any) each tree falls under.

I've made some progress already using other point-in-polygon threads on stackoverflow, especially this one that looks at multiple polygons, but I'm still getting errors when trying to use gContains and don't know how I could check/label each tree for different polygons (I'm guessing some sort of sapply or for loop?).

Below is my code. Data/shapefiles can be found here: http://bit.ly/1BMJubM

library(rgdal)
library(rgeos)
library(ggplot2)

#import data
setwd("< path here >")
xy <- read.csv("lonlat.csv")

#import shapefile
map <- readOGR(dsn="CPI_Zones-NTA", layer="CPI_Zones-NTA", p4s="+init=epsg:25832")
map <- spTransform(map, CRS("+proj=longlat +datum=WGS84"))

#generate the polygons, though this doesn't seem to be generating all of the NTAs
nPolys <- sapply(map@polygons, function(x)length(x@Polygons))
region <- map[which(nPolys==max(nPolys)),]
plot(region, col="lightgreen")

#setting the region and points
region.df <- fortify(region)
points <- data.frame(long=xy$INTPTLON10, 
                     lat =xy$INTPTLAT10,
                     id  =c(1:5449),
                    stringsAsFactors=F)

#drawing the points / polygon overlay; currently only the points are appearing
ggplot(region.df, aes(x=long,y=lat,group=group))+
  geom_polygon(fill="lightgreen")+
  geom_path(colour="grey50")+
  geom_point(data=points,aes(x=long,y=lat,group=NULL, color=id), size=1)+
  xlim(-74.25, -73.7)+
  ylim(40.5, 40.92)+
  coord_fixed()


#this should check whether each tree falls into **any** of the NTAs, but I need it to specifically return **which** NTA
sapply(1:5449,function(i)
  list(id=points[i,]$id, gContains(region,SpatialPoints(points[i,1:2],proj4string=CRS(proj4string(region))))))

#this is something I tried earlier to see if writing a new column using the over() function could work, but I ended up with a column of NAs
pts = SpatialPoints(xy)
nyc <- readShapeSpatial("< path to shapefile here >")
xy$nrow=over(pts,SpatialPolygons(nyc@polygons), returnlist=TRUE)

The NTAs we're checking for are these ones (visualized in GIS): http://bit.ly/1A3jEcE

Community
  • 1
  • 1
kmgong
  • 1

1 Answers1

0

Try simply:

ShapeFile <- readShapeSpatial("Shapefile.shp")
points <- data.frame(long=xy$INTPTLON10, 
                         lat =xy$INTPTLAT10,
                         stringsAsFactors=F)
dimnames(points)[[1]] <- seq(1, length(xy$INTPTLON10), 1)
points <- SpatialPoints(points)

df <- over(points, ShapeFile)

I omitted transformation of shapefile because this is not the main subject here.

statespace
  • 1,644
  • 17
  • 25
  • I tried doing this but it was only giving me NAs? xy$nrow=over(pts,SpatialPolygons(nyc@polygons), returnlist=TRUE) – kmgong Feb 24 '15 at 15:01
  • Try 1) without any arguments - plainly spatial points and spatial polygons. 2) in my applications I have always given name dimension for coordinates as in `dimnames(coords)[[1]] <- name_vector`. I can't see reasons why it is not working. – statespace Feb 24 '15 at 15:04
  • I tried the code here but I seem to be getting table full of NAs? There were a few trees with missing lon/lat in the original xy table but I've changed those to 0s...not sure if that's what's causing the NAs? – kmgong Feb 24 '15 at 17:41