I'm trying to separate the values in this character string in R so that I get both the longitude and the latitude.
What's the best way to go about this? Using gsub with a regex?
e.g.
"POINT (-90.10051372 29.97596117)"
Thanks
If you have multiple strings, you can use strcapture to return a data.frame:
vec <- c("POINT (-90.10051372 29.97596117)", "POINT (-91.10051372 28.97596117)")
strcapture("\\(([-0-9.]+)\\s+([-0-9.]+)", vec, proto = list(lon = 1,lat = 1))
#         lon      lat
# 1 -90.10051 29.97596
# 2 -91.10051 28.97596
Walk-through:
Pattern:
\\(([-0-9.]+)\\s+([-0-9.]+)
^^^                          literal left-paren
   ^^^^^^^^^^    ^^^^^^^^^^  groups of pos/neg numbers
             ^^^^            blank-space
We can add a right-paren \\) to the end of the pattern for good measure, though it probably does not add much.
proto= serves purely to pair each (..) pattern group with a column name, and the values in proto define the class/type of that column; the actual values don't matter. proto=c(lon=999,lat=198282) produces the same results, whereas proto=list(lon="",lat=9) would produce a character column for lon, which is not what we want/need. (list() is needed to mix types, since c(lon="",lat=9) coerces both elements to character.) If a captured value cannot be converted to the class given in proto=, for instance because the pattern group captured something non-number-like, that value will be NA.
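To see how proto= and the NA behaviour play out, here is a small sketch. The input strings are made up for illustration, the closing \\) has been added to the pattern, and suppressWarnings() is only there to hide the coercion warning that as.numeric() raises for the malformed value.

```r
# Hypothetical inputs: one well-formed, one with a malformed number,
# and one that does not match the pattern at all
vec <- c("POINT (-90.10051372 29.97596117)",
         "POINT (1.2.3 29.97596117)",
         "no coordinates here")

# proto only supplies column names and types; the values 0 are placeholders
res <- suppressWarnings(
  strcapture("\\(([-0-9.]+)\\s+([-0-9.]+)\\)", vec,
             proto = list(lon = 0, lat = 0))
)

res       # one row per input string
str(res)  # both columns are numeric
```

The second row should have NA only in lon (the group captured "1.2.3", which is not a number), and the third row should be NA in both columns because the pattern did not match.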
Since you are probably dealing with an sf object, here is an sf solution:
library(dplyr)
library(sf)
data.frame(a = "POINT (-90.10051372 29.97596117)") %>%
  st_as_sf(wkt = "a") %>%
  st_coordinates()
#          X        Y
#1 -90.10051 29.97596
Or, more lengthy but maybe more flexible:
st_as_sf(data.frame(a = "POINT (-90.10051372 29.97596117)"),
         wkt = "a") %>%
  mutate(lon = st_coordinates(.)[,1],
         lat = st_coordinates(.)[,2]) %>%
  st_drop_geometry()
The trick is in the regex.
x <- "POINT (-90.10051372 29.97596117)"
y <- sub("^[^\\(]+\\(([^\\)]+)\\)$", "\\1", x)
p <- as.numeric(strsplit(y, " ")[[1]])
p
#> [1] -90.10051 29.97596
Created on 2022-09-23 with reprex v2.0.2
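Explanation: the pattern matches everything up to the first opening parenthesis, then captures whatever sits between the parentheses, so sub() replaces the whole string with just that captured group ("\\1"). strsplit() then splits the result on the space, and as.numeric() converts the two pieces to numbers.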
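The same two steps (keep what is inside the parentheses, then split and convert) can be applied to a whole vector of strings. Below is a minimal sketch; parse_point() is a hypothetical helper name, and it assumes every element is a well-formed POINT with exactly two numbers.

```r
# parse_point() is a hypothetical helper, not part of base R
parse_point <- function(x) {
  # Keep only the text between the parentheses
  inner <- sub("^[^\\(]+\\(([^\\)]+)\\)$", "\\1", x)
  # Split each element on whitespace and convert the pieces to numbers
  parts <- lapply(strsplit(trimws(inner), "\\s+"), as.numeric)
  # Stack the (lon, lat) pairs into a two-column matrix
  coords <- do.call(rbind, parts)
  data.frame(lon = coords[, 1], lat = coords[, 2])
}

vec <- c("POINT (-90.10051372 29.97596117)",
         "POINT (-91.10051372 28.97596117)")
parse_point(vec)
```

Splitting on "\\s+" rather than a single space is a small design choice that tolerates extra whitespace between the two numbers.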
Another approach:
gsub('([A-Z]+ \\()(-?[0-9.]+)\\s(-?[0-9.]+)\\)', '\\2 \\3', "POINT (-90.10051372 29.97596117)")
# [1] "-90.10051372 29.97596117"
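The gsub() result is still a single string per input, so one more step is needed to get numeric longitude and latitude. One possible way, sketched here with made-up inputs and a pattern that allows a leading minus sign on both numbers, is to hand the cleaned strings to read.table():

```r
# Hypothetical inputs; the second one has a negative latitude
vec <- c("POINT (-90.10051372 29.97596117)",
         "POINT (-91.10051372 -28.97596117)")

# Reduce each string to "lon lat"
cleaned <- gsub("[A-Z]+ \\((-?[0-9.]+)\\s+(-?[0-9.]+)\\)", "\\1 \\2", vec)

# Parse the space-separated values into a data.frame with numeric columns
coords <- read.table(text = cleaned, col.names = c("lon", "lat"))
coords
```

read.table() treats each element of cleaned as one line of input, so this works for any number of strings.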