How can I transform a column of characters written as
c("0 y", "0 m", "23 d", "0 y", "0 m", "8 d")
into number values
c(0, 0, 23, 0, 0, 8)
Assuming y and m are always 0:
Oy.date.diff <- c("0 y, 0 m, 12 d", "0 y, 0 m, 13 d", "0 y, 0 m, 12 d", "0 y, 0 m, 15 d")
as.numeric(gsub(" d", "", gsub("0 y, 0 m, ", "", Oy.date.diff)))
# [1] 12 13 12 15
Note that R does not allow variable (or column) names to begin with a digit, so the first character is the uppercase letter O.
We can use sub to capture the digits before the space followed by 'd':
as.integer(sub(".*\\s(\\d+) d", "\\1", v1))
#[1] 12 13 12 15 12
Or with regmatches/regexpr
regmatches(v1, regexpr("(\\d+)(?= d$)", v1, perl = TRUE))
#[1] "12" "13" "12" "15" "12"
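One thing to keep in mind with regmatches/regexpr is that elements without a match are dropped, so the result can be shorter than the input column. The following is a minimal sketch of one way to keep the lengths aligned by filling non-matches with NA. The vector v and the "unknown" entry are made up for illustration and are not part of the question's data.

```r
# Hypothetical input: the second element has no "<digits> d" at the end
v <- c("0 y, 0 m, 12 d", "unknown", "0 y, 0 m, 8 d")

# regexpr() returns -1 for elements that do not match
m <- regexpr("\\d+(?= d$)", v, perl = TRUE)

# Start with all NA, then fill only the positions that matched
days <- rep(NA_integer_, length(v))
days[m != -1] <- as.integer(regmatches(v, m))
days
```

The result has one entry per input element, which matters when it is assigned back to a data frame column.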
If we need to convert everything to days (approximating a year as 365 days and a month as 30 days), then
library(dplyr)
library(tidyr)
tibble(col1 = v1) %>%
  tidyr::extract(col1, into = c('year', 'month', 'day'),
                 "^(\\d+) y, (\\d+) m, (\\d+) d$", convert = TRUE) %>%
  transmute(days = year * 365 + month * 30 + day)
The data used above:
v1 <- c("0 y, 0 m, 12 d", "0 y, 0 m, 13 d", "0 y, 0 m, 12 d",
        "0 y, 0 m, 15 d", "1 y, 2 m, 12 d")
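If adding dplyr and tidyr is not desirable, the same idea can probably be expressed in base R. This is only a sketch, assuming the strings always follow the "Y y, M m, D d" layout and using the same rough 365-day year and 30-day month as above. The names parts and days are chosen here for illustration.

```r
# Same sample data as in the answer above
v1 <- c("0 y, 0 m, 12 d", "0 y, 0 m, 13 d", "0 y, 0 m, 12 d",
        "0 y, 0 m, 15 d", "1 y, 2 m, 12 d")

# strcapture() returns one column per capture group, typed like the prototype
parts <- utils::strcapture(
  "^(\\d+) y, (\\d+) m, (\\d+) d$",
  v1,
  proto = data.frame(year = integer(), month = integer(), day = integer())
)

# Approximate total days: 365 per year, 30 per month
days <- with(parts, year * 365 + month * 30 + day)
days
```

Strings that do not match the pattern come back as NA rows, so malformed entries show up as NA in days rather than raising an error.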
You can try this capturing regex with gsub, which captures the number before a " d" and doesn't make any assumptions about the rest of the string. The leading .*? is lazy (hence perl = TRUE) so that all digits end up in the capture group:
x <- c("0 y, 0 m, 12 d", "0 y, 0 m, 13 d", "0 y, 0 m, 12 d", "0 y, 0 m, 15 d")
as.numeric(gsub("^.*?(\\d+) d.*$", "\\1", x, perl = TRUE))
#> [1] 12 13 12 15
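To see how the quantifier in front of the capture group affects the result, here is a small comparison sketch. It runs both patterns with perl = TRUE so they go through the same regex engine. The names greedy and lazy are only for illustration.

```r
x <- c("0 y, 0 m, 12 d", "0 y, 0 m, 13 d", "0 y, 0 m, 12 d", "0 y, 0 m, 15 d")

# Greedy .* consumes as much as it can, leaving a single digit for (\\d+)
greedy <- sub("^.*(\\d+) d.*$", "\\1", x, perl = TRUE)

# Lazy .*? stops as early as it can, so (\\d+) receives the whole number
lazy <- sub("^.*?(\\d+) d.*$", "\\1", x, perl = TRUE)

greedy
lazy
as.numeric(lazy)
```

With two-digit days the greedy form keeps only the final digit, while the lazy form keeps the full day count. Single-digit days come out the same either way.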