-1

I'm trying to split this character string in R so that I get the longitude and latitude as separate values.

What's the best way to go about this? Using gsub with a regex?

For example:

"POINT (-90.10051372 29.97596117)"

Thanks

Rui Barradas
Jean-Paul Azzopardi

4 Answers

3

If you have multiple strings, you can use strcapture to return a data.frame:

vec <- c("POINT (-90.10051372 29.97596117)", "POINT (-91.10051372 28.97596117)")
strcapture("\\(([-0-9.]+)\\s+([-0-9.]+)", vec, proto = list(lon = 1,lat = 1))
#         lon      lat
# 1 -90.10051 29.97596
# 2 -91.10051 28.97596

Walk-through:

  • Pattern:

    \\(([-0-9.]+)\\s+([-0-9.]+)
    ^^^                         literal left-paren
       ^^^^^^^^^^    ^^^^^^^^^^ groups of pos/neg numbers
                 ^^^^           blank-space
    

    We can add a right-paren \\) to the end of the pattern for good measure; it makes the match slightly stricter but is not needed for this input.

  • proto= only maps each (..) pattern group to a column name and a class. The values in proto define the class/type; the actual values don't matter, so proto=c(lon=999,lat=198282) produces the same result. By contrast, proto=list(lon="",lat=9) would make lon a character column, which is not what we want here. If a captured group cannot be converted to the class given in proto= (for instance, it captures something that is not number-like), the result is NA.
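
To see how proto= drives the column classes, here is a minimal sketch. The vector `vec2` is made up for illustration (its last element deliberately does not match), and the pattern includes the optional closing paren:

```r
# Illustrative input: the last element does not match the pattern at all.
vec2 <- c("POINT (-90.10051372 29.97596117)",
          "POINT (-91.10051372 28.97596117)",
          "not a point")
pat <- "\\(([-0-9.]+)\\s+([-0-9.]+)\\)"

# Numeric prototypes: both columns come back as numeric.
num <- strcapture(pat, vec2, proto = list(lon = numeric(), lat = numeric()))
sapply(num, class)

# A character prototype for lon keeps that column as strings.
chr <- strcapture(pat, vec2, proto = list(lon = character(), lat = numeric()))
sapply(chr, class)

# A string that does not match the pattern yields a row of NA.
num[3, ]
```

Using empty vectors such as `numeric()` in the prototype is just one way to make the intended class explicit; any value of the right class behaves the same.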

r2evans
2

Since you are probably dealing with an sf object, here is an sf solution:

library(dplyr)
library(sf)
data.frame(a = "POINT (-90.10051372 29.97596117)") %>% 
  st_as_sf(wkt = "a") %>% 
  st_coordinates()

#          X        Y
#1 -90.10051 29.97596

Or, more lengthy but maybe more flexible:

st_as_sf(data.frame(a = "POINT (-90.10051372 29.97596117)"),
         wkt = "a") %>% 
  mutate(lon = st_coordinates(.)[,1],
         lat = st_coordinates(.)[,2]) %>% 
  st_drop_geometry()
Maël
  • Short of calling `sf::st_coordinates` twice, can you capture it into a list-column and extract the individual components? – r2evans Sep 23 '22 at 13:13
  • Actually ... why do you need the `mutate`? Running `st_as_sf(..) %>% st_coordinates()` should return a simple frame (names `X` and `Y`, can use a renaming). Perhaps OCD or sloppy-looking, but: ``st_as_sf(..) |> sf::st_coordinates() |> `colnames<-`(c("lon", "lat"))`` – r2evans Sep 23 '22 at 13:14
  • Yes. I was about to add this solution. Just figured it out now – Maël Sep 23 '22 at 13:15
  • This seems likely to be the most "general" in that other SF-like entries (other than `POINT`) will almost certainly be handled transparently/elegantly. I suspect the regex solutions below (including mine) _may_ be able to do it, I don't know enough about the other options and use-case to know how robust regex will be here. Nice. – r2evans Sep 23 '22 at 13:18
1

The trick is in the regex.

x <- "POINT (-90.10051372 29.97596117)"
y <- sub("^[^\\(]+\\(([^\\)]+)\\)$", "\\1", x)
p <- as.numeric(strsplit(y, " ")[[1]])
p
#> [1] -90.10051  29.97596

Created on 2022-09-23 with reprex v2.0.2

Explanation:
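
One way to read the pattern, offered as a sketch rather than the answerer's original wording: it matches the whole string, captures only what sits between the parentheses, and `"\\1"` replaces the match with that captured group. The piece names below are illustrative only.

```r
x <- "POINT (-90.10051372 29.97596117)"

# Build the same pattern from named pieces (the names are illustrative only).
prefix <- "^[^\\(]+"    # from the start: everything before the left paren ("POINT ")
lparen <- "\\("         # the literal left paren
inside <- "([^\\)]+)"   # capture group 1: everything up to the right paren
rparen <- "\\)$"        # the literal right paren at the end of the string
pattern <- paste0(prefix, lparen, inside, rparen)

# The whole string matches, so replacing it with "\\1" leaves only group 1.
y <- sub(pattern, "\\1", x)
y

# Split the captured text on the space and convert both parts to numbers.
p <- as.numeric(strsplit(y, " ")[[1]])
p
```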

Rui Barradas
1

Another approach:

gsub('([A-Z]+ \\()(-?[0-9.]+)\\s(-?[0-9.]+)\\)', '\\2 \\3', "POINT (-90.10051372 29.97596117)")
[1] "-90.10051372 29.97596117"
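
The result above is still a single string. As a minimal sketch of one way to get numeric columns for several inputs at once, the cleaned strings can be passed to `read.table(text = ...)`. The vector `wkt` is made up for illustration, and the pattern here allows a leading minus sign on both numbers so that southern latitudes also match:

```r
# Illustrative input, including a point with a negative latitude.
wkt <- c("POINT (-90.10051372 29.97596117)", "POINT (151.2093 -33.8688)")

# Keep only the two numbers, separated by a space.
cleaned <- gsub("([A-Z]+ \\()(-?[0-9.]+)\\s(-?[0-9.]+)\\)", "\\2 \\3", wkt)

# Parse the space-separated numbers into a two-column data.frame.
coords <- read.table(text = cleaned, col.names = c("lon", "lat"))
coords
```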
Karthik S