I am not sure why I keep getting NA whenever I run the over() function with latitude/longitude points against the polygon from a shapefile. Please note that this is my first time doing spatial analysis; I have done my research and replicated examples, but did not succeed. I need some points which are outside of the polygon to be NA, so I can focus on the real data.

I read these sources since they pertain to my problem, but I still can't work it out:
sp::over() for point in polygon analysis
https://gis.stackexchange.com/questions/133625/checking-if-points-fall-within-polygon-shapefile
https://gis.stackexchange.com/questions/278723/r-error-in-checking-point-inside-polygon

Here is my code chunk

library(sp)
library(rgdal)
library(readr)

gainsville_df <- read_csv("311_Service_Requests__myGNV_.csv")
gnv <- readOGR("~\\Downloads\\GIS_cgbound", layer = "cgbound")

gnv_latlon <- spTransform(gnv, CRS("+proj=longlat +ellps=WGS84 +datum=WGS84"))

gnv_raw <- data.frame(Longitude= gainsville_df$Longitude, Latitude= gainsville_df$Latitude)

coordinates(gnv_raw) <- ~Longitude + Latitude
proj4string(gnv_raw) <- proj4string(gnv)
over(gnv_raw, as(gnv,"SpatialLinesDataFrame"))

# Yields:
#  FID_cgboun Id Perimeter Area Acres Hectares Shape_Leng
#1         NA NA        NA   NA    NA       NA         NA

# Desired output:
# I should see which Gainesville latitude/longitude points are within the shapefile
# polygon, so I can drop the outliers that have NA. According to this output, none of my
# lat/lon points are inside the polygon.

The datafiles are here:
Shapefile: https://github.com/THsTestingGround/SO_readOGR_quest/tree/master/GIS_cgbound
reading csv file: https://github.com/THsTestingGround/SO_readOGR_quest/blob/master/311_Service_Requests__myGNV_.csv

I would appreciate if someone can help me out.

WannabeSmith

1 Answer

I noticed that your CSV stores each location as text in the form POINT (-82.34323174 29.67058748), so I first rebuilt the point data as an sf object. I also assigned a coordinate reference system (EPSG:4326) here.

library(tidyverse)
library(sf)
library(RCurl)

# Download the raw CSV text from GitHub
url <- getURL("https://raw.githubusercontent.com/THsTestingGround/SO_readOGR_quest/master/311_Service_Requests__myGNV_.csv")

# Strip the "POINT (" and ")" wrapper, split the rest into lon/lat,
# then build an sf object from those two columns and set the CRS to WGS84
mydf <- read_csv(url) %>% 
        mutate(Location = gsub(x = Location, pattern = "POINT \\(|\\)", replacement = "")) %>% 
        separate(col = "Location", into = c("lon", "lat"), sep = " ") %>% 
        st_as_sf(coords = c(3,4)) %>% 
        st_set_crs(4326)

I imported your shapefile with the sf package, since the point data (mydf in this demonstration) is an sf object. On import, I noticed that the shapefile contains LINESTRING geometries, not polygons. I believe this is the reason why over() did not work: a point can fall inside a polygon, but not inside a line. Here I created polygons from the lines and then merged all seven polygons into one.

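# Read the boundary lines, reproject to WGS84, build polygons from the lines,
# and merge the resulting polygons into a single geometry
mypoly <- st_read("cgbound.shp") %>% 
          st_transform(crs = 4326) %>% 
          st_polygonize() %>% 
          st_union()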
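
To illustrate why the geometry type matters, here is a minimal sketch on toy data, not your shapefile. The names ring, pts, hits_line, and hits_poly are made up for this example, and the coordinates are plain planar values with no CRS. The same closed boundary is tested once as a LINESTRING and once after st_polygonize():

```r
library(sf)

# A closed boundary stored as a LINESTRING (toy stand-in for the shapefile lines)
ring <- st_sfc(st_linestring(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))

# Two toy points: the first lies inside the boundary, the second outside
pts <- st_sfc(st_point(c(0.5, 0.5)), st_point(c(2, 2)))

# Tested against the line, neither point intersects: a line has no interior
hits_line <- st_intersects(pts, ring, sparse = FALSE)[, 1]

# After polygonizing and merging, only the inside point intersects
poly <- st_union(st_polygonize(ring))
hits_poly <- st_intersects(pts, poly, sparse = FALSE)[, 1]

print(hits_line)
print(hits_poly)
```

The point inside the ring is only detected once the ring has an interior, which is the situation I suspect you ran into with over().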

Let's check what your data points and polygon look like. You clearly have data points outside of the polygon.

ggplot() +
geom_sf(data = mypoly) +
geom_point(data = mydf, aes(x = Longitude, y = Latitude))

You said, "I need some points which are outside of the polygon to be NA." So I created a new column in mydf, check, using st_intersects(). If a data point falls within the polygon, check is TRUE; otherwise it is FALSE.

mutate(mydf,
      check = as.vector(st_intersects(x = mydf, y = mypoly, sparse = FALSE))) -> result

Finally, plot the points colored by check to see how they were classified.

ggplot() +
geom_sf(data = mypoly) +
geom_point(data = result, aes(x = Longitude, y = Latitude, color = check))

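If you want to use over() together with this sf workflow, you can convert both objects to sp classes first. Note that over() here returns the index of the matching polygon (1) for points inside and NA for points outside, rather than TRUE/FALSE.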

mutate(mydf,
       check = over(as(mydf, "Spatial"), as(mypoly, "Spatial")))

The last thing you want to do is subset the data to the points inside the polygon.

filter(result, check == TRUE)
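
If you would rather get NA for outside points than TRUE/FALSE, one option is a left spatial join with st_join(), which keeps every point and fills the polygon's attributes with NA where nothing matches. This is a minimal sketch on toy data; poly_sf, pts_sf, zone_id, and pt are hypothetical names, not columns from your files:

```r
library(sf)

# Toy polygon (unit square) carrying a hypothetical attribute `zone_id`
square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
poly_sf <- st_sf(zone_id = 1L, geometry = st_sfc(square))

# Toy points: the first is inside the square, the second is outside
pts_sf <- st_sf(pt = c("inside", "outside"),
                geometry = st_sfc(st_point(c(0.5, 0.5)), st_point(c(2, 2))))

# Left join: all points are kept; points outside the polygon get NA in zone_id
joined <- st_join(pts_sf, poly_sf, join = st_intersects, left = TRUE)

# Drop the points whose polygon attribute is NA
inside_only <- joined[!is.na(joined$zone_id), ]

print(joined)
```

With your data, mypoly would need to be wrapped in st_sf() with at least one attribute column first, since st_join() works on sf objects and mypoly is a bare geometry after st_union().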

THE SIMPLEST WAY

I have shown how things work step by step with the sf approach, but the following is actually all you need. st_filter() keeps the data points that fall within mypoly and drops those outside. If you do not need NAs for the outside points, this is much easier.

st_filter(x = mydf, y = mypoly, predicate = st_intersects) -> result2

ggplot() +
geom_sf(data = mypoly) +
geom_sf(data = result2)

jazzurro
    @AgentSmith You are welcome. It is up to you if you wanna use sp or sf package in the end. But sf package makes your life a lot easier in my view. – jazzurro Mar 08 '20 at 04:53
    Sorry, I was following several examples from around the web, since this is my first time doing this type of work. I will keep that in mind for sure; in fact, I am going to YouTube now to look for some `sf` tutorials. – WannabeSmith Mar 08 '20 at 04:56
    @AgentSmith Good luck. I myself have more things to learn. :) – jazzurro Mar 08 '20 at 05:13