1

I've read through many forums regarding this topic but I can't seem to adapt anything I've read to my particular question. Basically, I have a data frame of lat/lon values and all I want to do is test whether these coordinates exist within California.

Here is some example data:

library(tidyverse)
library(sf)

coords <- tribble(
  ~city, ~lon, ~lat,
  LA, -118.2437, 34.0522,
  SF, -122.4194, 37.7749,
  SAC, -121.4944, 38.5816,
  CHI, -87.6298, 41.8781,
  NY, -74.0060, 40.7128
)

And here is a link to the shape files from the state website: CA Shape Files.

I think I'm close...

# read in shape data
cali <- read_sf("CA_State_TIGER2016.shp")

# convert coordinates to spatial point compatible data
coords_sf <- st_as_sf(coords, coords = c("lon", "lat"), crs = st_crs(cali))

From there, I assume I use st_contains to test whether my cali object contains the coordinates found in coords_sf but I can't get it to work.

Any advice?

Thanks for your help!

Sébastien Rochette
  • 6,536
  • 2
  • 22
  • 43
Trent
  • 771
  • 5
  • 19
  • what you can do it an `inner_join` / `merge` on the two files by lat and long. If they are in both files then you know that the lot long pair is in CA – Mike May 11 '20 at 20:26
  • When I do that I get `NA` for all cali values that were joined over. For example, `st_join(coords_sf, cali)` – Trent May 11 '20 at 20:29
  • Please look at [sp::over() for point in polygon analysis](https://stackoverflow.com/q/19002744/4752675) and [Assign polygon to data point in R dataframe](https://stackoverflow.com/q/44020974/4752675) – G5W May 11 '20 at 20:31

1 Answers1

4

In your code, there is a confusion between the original coordinate reference system of your point dataset coords and the crs you want to apply to it.
Note that your dataset named coords is not a spatial dataset. You need to make it a spatial dataset with st_as_sf(). The crs of the coordinates you entered in this dataframe is "geographical coordinates".
Once this is a dataset, you can then transform it to the target crs. In your code, you tried to do both at the same time.

Hence the answer you are looking for is:

library(tidyverse)
library(sf)

coords <- tribble(
  ~city, ~lon, ~lat,
  "LA", -118.2437, 34.0522,
  "SF", -122.4194, 37.7749,
  "SAC", -121.4944, 38.5816,
  "CHI", -87.6298, 41.8781,
  "NY", -74.0060, 40.7128
)

file <- tempfile(fileext = ".zip")
download.file("https://data.ca.gov/dataset/e212e397-1277-4df3-8c22-40721b095f33/resource/3db1e426-fb51-44f5-82d5-a54d7c6e188b/download/ca-state-boundary.zip", destfile = file)
unzip(zipfile = file)

# read in shape data
cali <- read_sf("CA_State_TIGER2016.shp")

# Your data are originally geographical coordinates which has EPSG=4326
coords_sf <- st_as_sf(coords, coords = c("lon", "lat"), crs = 4326)
# Then you can transform them to the system you want
coords_cali <- coords_sf %>% st_transform(crs = st_crs(cali))

cali %>% st_contains(coords_cali)

If you want to add the information of the cali shapefile in your point dataset you can:

  • Keep entire point dataset and put NA
coords_cali %>% 
  st_join(cali)
  • Keep only points that are inside the cali polygon
coords_cali %>% 
  st_intersection(cali)
Sébastien Rochette
  • 6,536
  • 2
  • 22
  • 43
  • This works! But, I have a couple of follow up questions. (1) Why did you set the `crs` to 4326? (2) Is there a way to extract the indices with my "city" identifier into a dataframe? – Trent May 11 '20 at 20:46
  • 4326 is the EPSG number of geographical coordinates. For the others types of extraction, I edited my answer. – Sébastien Rochette May 11 '20 at 20:51
  • This is a fantastic answer! Thank you so much. Last question, can you tell me how you got that number? When I view the `cali` object I see that the CRS is set to 3857. – Trent May 11 '20 at 21:05
  • 1
    Experience :-) 4326 is the most used. You can find the others here: http://epsg.io/ However, do not get used to it because crs will very soon change for WKT: https://www.r-spatial.org/r/2020/03/17/wkt.html – Sébastien Rochette May 11 '20 at 21:09
  • You're not kidding either. I tried setting the `crs` argument to 3857 and it didn't work. Very strange. Thanks again so much for your help. – Trent May 11 '20 at 21:20