I have this input (sample):
input <- tibble(
  minimum_term = c("None", "6 Months", "12 Months"),
  maximum_term = c("None", "18 Months", "24 Months"),
  other_cols
)
and I would like to get to this output:
desired_output <- tibble(
  minimum_term = c(0, 6, 12),
  maximum_term = c(0, 18, 24),
  other_cols
)
How could I write the following more succinctly (maybe in a function and using purrr::map)?
library(dplyr)
library(stringr)

input <- input %>%
  mutate(minimum_term = str_replace(minimum_term, 'None', "0"))

input <- input %>%
  mutate(minimum_term = str_extract(minimum_term, '[0-9]{1,2}'))

output <- input %>%
  mutate(minimum_term = as.numeric(minimum_term))
- The first operation takes minimum_term from the data frame input and replaces all instances of "None" with "0".
- The second operation extracts the numbers.
- The third converts the result to numeric.
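The three steps above can be sketched as a single helper that works on a character vector rather than on the whole data frame. This is only a minimal sketch: the name term_to_number is hypothetical, and it assumes every non-"None" value contains a one- or two-digit number, as in the sample.

```r
library(stringr)

# Hypothetical helper: applies the three steps to one character vector.
term_to_number <- function(x) {
  x <- str_replace(x, "None", "0")    # step 1: "None" -> "0"
  x <- str_extract(x, "[0-9]{1,2}")   # step 2: keep the first 1-2 digit number
  as.numeric(x)                       # step 3: convert to numeric
}

term_to_number(c("None", "6 Months", "12 Months"))
# returns 0, 6, 12
```

Keeping the logic at the vector level means it can be reused inside mutate() for any column.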
I have more columns similar to minimum_term, so I'm keen to put this into a pipeable function and use purrr, but I'm unsure how to do this. My first try:
term_replacement <- function(df, x){
  df <- df %>%
    mutate(x = str_replace(x, 'None', "0"))
  df <- df %>%
    mutate(x = str_extract(x, '[0-9]{1,2}'))
  df <- df %>%
    mutate(x = as.numeric(x))
}
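Two things likely trip up the attempt above: inside mutate(), x = ... targets a column literally named x rather than the column passed in, and the function ends on an assignment, so its result is returned invisibly. One possible sketch of a pipeable version follows. Note that it swaps purrr::map for dplyr::across(), which assumes dplyr 1.0.0 or later. The name clean_terms and its cols argument are hypothetical, and other_cols is left out of the sample data.

```r
library(dplyr)
library(stringr)

# Hypothetical pipeable wrapper: `cols` is a tidy-select specification,
# forwarded to across() with the embrace operator {{ }}.
clean_terms <- function(df, cols) {
  df %>%
    mutate(across(
      {{ cols }},
      # "None" -> "0", extract the digits, convert to numeric
      ~ as.numeric(str_extract(str_replace(.x, "None", "0"), "[0-9]{1,2}"))
    ))
}

input <- tibble(
  minimum_term = c("None", "6 Months", "12 Months"),
  maximum_term = c("None", "18 Months", "24 Months")
)

output <- input %>% clean_terms(ends_with("_term"))
```

Because cols accepts any tidy-select expression, the same call also works with explicit names, for example clean_terms(c(minimum_term, maximum_term)).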