I would like to apply a mutate function to several columns at once. The columns should be identified by the string of characters their names start with. I would also like to know how to apply it to columns selected by index, i.e. data_fake[3:4]. The goal is to remove all non-numeric characters and convert the values to numeric. Sadly, I can't make it work. The desired result is given at the end of the code. Thanks a lot.

library(dplyr)

data_fake <- data.frame(c("1","2","NA","3,","2","1 only"),c(1,2,3,4,5,6),
                        c("23","3 bundles","4","5","NA","1"), c("3","5 packs","6","78","7","8"))
colnames(data_fake) <- c("AB foo ab", "AB foo bc", "CD foo ab","CD foo bc")

data_fake <- as_tibble(data_fake)

data_fake %>%
        select(starts_with("CD foo")) %>% 
        mutate(as.numeric(gsub("[^0-9]", "")))

data_fake_results <- as_tibble(data.frame(c("1","2","NA","3,","2","1 only"),c(1,2,3,4,5,6),
                        c(23,3,4,5,NA,1), c(3,5,6,78,7,8)))
camille
MIH
  • See `?mutate_if`. And you need to give `gsub` three arguments, the `pattern`, the `replacement`, and `x` the string to look in. – Gregor Thomas Oct 01 '18 at 16:36
  • Possible duplicate of [dplyr change many data types](https://stackoverflow.com/questions/27668266/dplyr-change-many-data-types) – camille Oct 01 '18 at 16:51
  • At the post that I marked as a duplicate, some of the answers are outdated, but [this one](https://stackoverflow.com/a/38428978/5325862) uses current `dplyr` selectors – camille Oct 01 '18 at 16:52
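
Following the comment about `gsub` needing three arguments, here is a minimal base R sketch of the asker's idea with `x` supplied. The helper name `digits_only` is made up for illustration, and the `"[^0-9]"` pattern also deletes decimal points and minus signs, so it only suits non-negative whole numbers like the sample data.

```r
# Same sample data as in the question, with the text columns kept as character
data_fake <- data.frame(
  c("1", "2", "NA", "3,", "2", "1 only"),
  c(1, 2, 3, 4, 5, 6),
  c("23", "3 bundles", "4", "5", "NA", "1"),
  c("3", "5 packs", "6", "78", "7", "8"),
  stringsAsFactors = FALSE
)
colnames(data_fake) <- c("AB foo ab", "AB foo bc", "CD foo ab", "CD foo bc")

# Hypothetical helper: gsub(pattern, replacement, x) strips every non-digit,
# then the remaining text is converted to numeric ("NA" becomes "" and then NA)
digits_only <- function(x) as.numeric(gsub("[^0-9]", "", x))

# Select the columns by position ...
by_index <- data_fake
by_index[3:4] <- lapply(by_index[3:4], digits_only)

# ... or by the prefix of their names
by_name <- data_fake
cd_cols <- startsWith(names(by_name), "CD foo")
by_name[cd_cols] <- lapply(by_name[cd_cols], digits_only)
```

Both variants leave the "AB foo" columns untouched and should produce the same data frame.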

1 Answer

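We can use mutate_at, selecting the columns by position

library(tidyverse)

# Strip a trailing word such as " bundles" or " packs", then convert to numeric.
# A formula lambda replaces funs(), which is deprecated since dplyr 0.8.
data_fake %>%
    mutate_at(vars(3:4), ~ as.numeric(str_remove(., "\\s+[a-z]+")))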

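Or use parse_number from readr, which extracts the first number from each string and drops the surrounding non-numeric characters

data_fake %>%
    mutate_at(3:4, parse_number)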

If we want to select the columns by the start of their names instead, wrap a selection helper in vars() inside mutate_at

data_fake %>% 
    mutate_at(vars(starts_with("CD")), parse_number)
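
As a hedged aside, assuming dplyr 1.0 or later is installed: the scoped `mutate_at` verb is superseded there by `across()` inside `mutate()`. The sketch below rebuilds the question's data with `tibble()` so it runs on its own, and shows both selection styles.

```r
library(dplyr)
library(readr)

# Sample data from the question, rebuilt as a tibble with character columns
data_fake <- tibble(
  `AB foo ab` = c("1", "2", "NA", "3,", "2", "1 only"),
  `AB foo bc` = c(1, 2, 3, 4, 5, 6),
  `CD foo ab` = c("23", "3 bundles", "4", "5", "NA", "1"),
  `CD foo bc` = c("3", "5 packs", "6", "78", "7", "8")
)

# Select by name prefix
by_name <- data_fake %>%
  mutate(across(starts_with("CD foo"), parse_number))

# Select by column position
by_index <- data_fake %>%
  mutate(across(3:4, parse_number))
```

Here `parse_number` treats the string "NA" as a missing value by default, so the fifth row of "CD foo ab" should come out as `NA` rather than raise a parsing problem.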
akrun
  • Fantastic! And how to do exactly the same thing but choosing columns by the title (here they should start with "CD foo")? – MIH Oct 01 '18 at 16:44
  • @MIH `mutate_at(vars(starts_with("CD"))` – akrun Oct 01 '18 at 16:45
  • @MIH You can find all the functions used to specify columns in `?select`. The scoped (`*_at`) versions of `mutate`, `transmute`, `rename`, etc. use the same specifying functions as `select` – divibisan Oct 01 '18 at 16:50