I have a data frame with a factor column of ordered pairs, (latitude, longitude), and I would like to extract the latitude and longitude into separate numeric columns. How would I get rid of the commas and parentheses and place the values into their own columns as numbers?
As was mentioned in your previous question, please make this question reproducible and self-contained. By that I mean including attempted code (please be explicit about non-base packages), sample representative data (perhaps via `dput(head(x))` or building data programmatically (e.g., `data.frame(...)`), possibly stochastically after `set.seed(1)`), perhaps actual output (with verbatim errors/warnings) versus intended output. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans Dec 14 '20 at 21:22
1 Answer
Several ways, but I'll focus on `strcapture`. My sample data:
somecoords <- c("(1.1,2.2)","(3.3,4.4)")
# if not 'character', then
somecoords <- as.character(somecoords)
`strcapture` starts with a vector of strings and returns a `data.frame`:
strcapture("\\D*(-?[0-9]+\\.?[0-9]*),(-?[0-9]+\\.?[0-9]*)\\D?.*$",
somecoords, proto = list(num1=0, num2=0))
# num1 num2
# 1 1.1 2.2
# 2 3.3 4.4
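To connect this back to the question, here is a minimal sketch of applying the same pattern to a factor column inside a data frame. The data frame `df` and its column name `coords` are made up for illustration (the question did not include sample data), and the output columns are named `lat` and `lon` here instead of `num1` and `num2`.

```r
# Hypothetical data: a factor column of "(lat,lon)" strings
df <- data.frame(id = 1:2,
                 coords = factor(c("(1.1,2.2)", "(3.3,4.4)")))

# Convert the factor to character first, then capture both numbers.
# 'proto' gives the names and types of the resulting columns.
parsed <- strcapture("\\D*(-?[0-9]+\\.?[0-9]*),(-?[0-9]+\\.?[0-9]*)\\D?.*$",
                     as.character(df$coords),
                     proto = data.frame(lat = numeric(), lon = numeric()))

# Bind the new numeric columns onto the original data frame
df <- cbind(df, parsed)
str(df)
```

Rows whose string does not match the pattern should come back as `NA` in both new columns, which makes malformed entries easy to spot.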
Regex walk-through:
- `\\D*`: zero or more non-digit characters
- `(...)`: a capture group, saved by `strcapture` into a column
- `-?`: a literal dash/hyphen, optional
- `[0-9]+`: one or more digits
- `\\.?`: literal dot, optional, in case there are whole-number coordinates in your data
- `[0-9]*`: zero or more digits
- `,`: literal comma
- `\\D?.*`: optional non-digit character, followed by zero or more of anything
- `$`: end of string (perhaps not required, since `.*` should have expanded fully)
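One caveat worth checking against real coordinates: a hyphen is itself a non-digit, so a greedy leading `\\D*` can consume the minus sign of a negative first number before the capture group gets a chance to match it. The variant below is a sketch, not part of the original answer, that excludes `-` from the leading skip and also tolerates whitespace around the comma. The sample values are invented for illustration.

```r
# Hypothetical coordinates, including negatives and a space after the comma
signed <- c("(-33.87, 151.21)", "(40.71,-74.01)")

# [^-0-9]* skips leading characters that are neither a digit nor a minus sign,
# so the sign stays available for the first capture group.
# [[:space:]]* allows optional whitespace on either side of the comma.
pat <- "^[^-0-9]*(-?[0-9]+\\.?[0-9]*)[[:space:]]*,[[:space:]]*(-?[0-9]+\\.?[0-9]*)"

res <- strcapture(pat, signed,
                  proto = data.frame(lat = numeric(), lon = numeric()))
res
```

If your data never contains negative latitudes or spaces, the original pattern is enough; this version only matters when those cases can occur.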

r2evans