I have a data frame with a factor column of ordered pairs, (latitude, longitude). I would like to split it into two separate numeric columns, one for latitude and one for longitude. How do I strip the commas and parentheses and put the values into their own columns as numbers?

Buchlord
    As was mentioned in your previous question, please make this question reproducible and self-contained. By that I mean including attempted code (please be explicit about non-base packages), sample representative data (perhaps via `dput(head(x))` or building data programmatically (e.g., `data.frame(...)`), possibly stochastically after `set.seed(1)`), perhaps actual output (with verbatim errors/warnings) versus intended output. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans Dec 14 '20 at 21:22

1 Answer

Several ways, but I'll focus on strcapture. My sample data:

somecoords <- c("(1.1,2.2)","(3.3,4.4)")
# if not 'character', then
somecoords <- as.character(somecoords)
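
Since the question mentions factors, here is a minimal sketch of why the `as.character` step matters. The name `coords_factor` is hypothetical and only stands in for the real column.

```r
# Hypothetical factor column, standing in for the question's data.
coords_factor <- factor(c("(1.1,2.2)", "(3.3,4.4)"))

# as.numeric() on a factor returns the internal level codes, not the text.
codes <- as.numeric(coords_factor)

# Converting to character first recovers the original strings.
somecoords <- as.character(coords_factor)
```

The level codes are just positions in `levels(coords_factor)`, so they carry no coordinate information; the character strings are what the regex needs.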

strcapture starts with a vector of strings and returns a data.frame:

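strcapture("[^0-9-]*(-?[0-9]+\\.?[0-9]*),(-?[0-9]+\\.?[0-9]*)\\D?.*$",
           somecoords, proto = list(num1=0, num2=0))
#   num1 num2
# 1  1.1  2.2
# 2  3.3  4.4

Regex walk-through:

  • [^0-9-]* zero or more characters that are neither digits nor hyphens (a plain \\D* would also swallow a leading minus sign, since a hyphen is a non-digit, and a negative first coordinate would lose its sign)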
  • (...) a capture group, saved by strcapture into a column
  • -? a literal dash/hyphen, optional
  • [0-9]+ one or more digits
  • \\.? literal dot, optional, in case there are whole-number coordinates in your data
  • [0-9]* zero or more digits
  • , literal comma
  • \\D?.* optional non-digit character, followed by zero or more of anything
  • $ end of string (perhaps not required, since .* should have expanded fully)
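
To tie this back to the question, here is a sketch that applies the same idea to a factor column inside a data frame. The data frame `df` and the column names `id`, `coords`, `lat`, and `lon` are assumptions made for illustration, not taken from the question. The pattern opens with `[^0-9-]*`, which skips leading characters but leaves a minus sign for the first capture group, and it drops the trailing `\\D?.*$` since nothing after the second number needs to be matched.

```r
# Hypothetical data; the column names are assumptions.
df <- data.frame(
  id = 1:3,
  coords = factor(c("(1.1,2.2)", "(-33.87,151.21)", "(40,-74)"))
)

# Convert the factor to character, then capture the two numbers.
parsed <- strcapture(
  "[^0-9-]*(-?[0-9]+\\.?[0-9]*),(-?[0-9]+\\.?[0-9]*)",
  as.character(df$coords),
  proto = list(lat = 0, lon = 0)
)

# Attach the new numeric columns next to the original ones.
df <- cbind(df, parsed)
str(df)
```

Rows that do not match the pattern come back as `NA` in both new columns, which makes malformed entries easy to spot with `is.na(df$lat)`.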
r2evans