
I have this input (sample):

input <- tibble(
  minimum_term = c("None", "6 Months", "12 Months"),
  maximum_term = c("None", "18 Months", "24 Months"),
  other_cols
)

and I would like to get to this output:

desired_output <- tibble(
  minimum_term = c(0, 6, 12),
  maximum_term = c(0, 18, 24),
  other_cols
)

How could I write the following more succinctly (maybe in a function and using purrr::map)?

library(dplyr)
library(stringr)

input <- input %>% 
  mutate(minimum_term = str_replace(
    minimum_term,
    'None',
    "0"
  )
  )

input <- input %>% 
  mutate(minimum_term = str_extract(minimum_term, '[0-9]{1,2}'))

output <- input %>% 
  mutate(minimum_term = as.numeric(minimum_term))
1. The first operation takes minimum_term from the data frame input and replaces all instances of "None" with "0".
2. The second operation extracts the numbers.
3. The third operation converts the result to numeric.

I have more columns similar to minimum_term, so I'm keen to put this into a pipeable function and use purrr, but I'm unsure how to do this. My first try:

term_replacement <- function(df, x){
  df <- df %>% 
    mutate(x = str_replace(
       x,
      'None',
      "0"
    )
  )
  df <- df %>% 
    mutate(x = str_extract(x, '[0-9]{1,2}'))
  df <- df %>%
    mutate(x = as.numeric(x))
}
Three14
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Aug 10 '21 at 17:23
  • @MrFlick, thanks, have done just that. – Three14 Aug 10 '21 at 17:41

1 Answer


If there are multiple columns, use across():

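library(dplyr)
library(tidyr)  # for replace_na()

# Parse the number out of each selected column. "None" parses to NA (readr
# emits a parsing warning), and replace_na() then turns that NA into 0.
term_replacement <- function(df, cols) {
  df %>%
    mutate(across(all_of(cols), ~ replace_na(readr::parse_number(.), 0)))
}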

Call the function as follows (change the column names as needed):

term_replacement(input, c("minimum_term", "maximum_term"))

Output

# A tibble: 3 x 2
  minimum_term maximum_term
         <dbl>        <dbl>
1            0            0
2            6           18
3           12           24
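
If you would rather keep the three steps from the question (replace "None", extract the digits, convert to numeric), the same across() pattern should work with stringr. This is only a sketch: term_replacement_stringr is a hypothetical name, and the sample tibble leaves out the other_cols placeholder.

```r
library(dplyr)
library(stringr)

# Sample data from the question, without the elided other_cols placeholder.
input <- tibble(
  minimum_term = c("None", "6 Months", "12 Months"),
  maximum_term = c("None", "18 Months", "24 Months")
)

# Hypothetical helper: applies the question's three steps to every column
# named in `cols`.
term_replacement_stringr <- function(df, cols) {
  df %>%
    mutate(across(
      all_of(cols),
      ~ as.numeric(                          # step 3: convert to numeric
        str_extract(                         # step 2: extract the digits
          str_replace(.x, "None", "0"),      # step 1: "None" becomes "0"
          "[0-9]{1,2}"
        )
      )
    ))
}

result <- term_replacement_stringr(input, c("minimum_term", "maximum_term"))
```

Note that the pattern '[0-9]{1,2}' from the question only captures the first one or two digits, so a value like "120 Months" would come back as 12. If longer terms are possible, '[0-9]+' may be the safer pattern.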
akrun