I call mutate several times on the same column. How can I call mutate only once, without repeating the column name?

library(dplyr)

df <- data.frame(
  c1 = c("Élève", "Café", "Château", "Noël", "Crème")
)

df2 <- df %>% 
  mutate(c1 = trimws(c1)) %>%
  mutate(c1 = gsub("\\s+", " ", c1)) %>%
  mutate(c1 = gsub("\"", "", c1)) %>%
  mutate(c1 = iconv(toupper(c1), to = "ASCII//TRANSLIT"))
  
Mark
dia05

3 Answers

Place the pipeline within the mutate like this:

df3 <- df %>%
  mutate(c1 = c1 %>%
    trimws %>%
    gsub("\\s+", " ", .) %>%
    gsub("\"", "", .) %>%
    toupper %>%
    iconv(to = "ASCII//TRANSLIT"))

identical(df2, df3)
## [1] TRUE
G. Grothendieck

You can use pipes within mutate calls! Also, even if that weren't the case, columns you create in a mutate function call can be used later within the same function call. So you could keep on redefining c1 within one mutate call.
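
As a minimal sketch of that second point, the steps can be written as successive redefinitions of c1 inside one mutate call. The sample data frame df_demo and the result name df_seq are made up for illustration, and iconv is left out here because its transliteration output can vary by platform. Note that this still repeats the column name, so it only covers the "single mutate" half of the question.

```r
library(dplyr)

# Hypothetical sample data with stray spaces and quotes
df_demo <- data.frame(c1 = c("  a   b ", "\"c\"  d"))

df_seq <- df_demo |>
  mutate(
    c1 = trimws(c1),             # strip leading/trailing whitespace
    c1 = gsub("\\s+", " ", c1),  # collapse runs of whitespace
    c1 = gsub("\"", "", c1),     # drop double quotes
    c1 = toupper(c1)             # each step sees the previous result
  )

df_seq$c1
```

Each argument is evaluated in order, so every redefinition works on the value produced by the one before it.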

But anyway, this is how I would do it (using mostly stringr functions):

library(dplyr)
library(stringr)

df2 <- df |>
  mutate(c1 = str_squish(c1) |>
              str_remove_all("\"") |>
              str_to_upper() |>
              iconv(to = "ASCII//TRANSLIT"))
Mark

Not that you need another solution, but it could be handy to combine all your steps into a single function to tidy up your mutate call. You can combine a series of functions easily with purrr::compose and reuse the result each time you need it. Note that compose applies its functions from right to left unless you pass .dir = "forward".
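
A small sketch of why the direction can matter, using two toy steps. The names replace_a, backward and forward are made up for illustration; only compose and its .dir argument come from purrr.

```r
library(purrr)

# Hypothetical step: replace the first lowercase "a" with "x"
replace_a <- function(s) sub("a", "x", s)

# Default direction is backward: toupper(replace_a(s))
backward <- compose(toupper, replace_a)

# Forward direction: replace_a(toupper(s))
forward <- compose(toupper, replace_a, .dir = "forward")

backward("abc")  # "XBC": the replacement runs first, then toupper
forward("abc")   # "ABC": toupper runs first, so no lowercase "a" is left
```

For the cleaning steps in this question the two orders happen to give the same result on the sample data, but that is not guaranteed in general.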

Using G. Grothendieck's excellent code split into anonymous functions:

library(tidyverse)

df <- data.frame(
  c1 = c("Élève", "Café", "Château", "Noël", "Crème")
)

tidy_text <- compose(
  trimws,
  \(t) gsub("\\s+", " ", t),
  \(t) gsub("\"", "", t),
  toupper,
  \(t) iconv(t, to = "ASCII//TRANSLIT"),
  .dir = "forward"  # compose() applies functions right to left by default
)

df %>% 
  mutate(c1 = tidy_text(c1))
#>        c1
#> 1   ELEVE
#> 2    CAFE
#> 3 CHATEAU
#> 4    NOEL
#> 5   CREME

Or using Mark's tidyverse code and purrr formula/function syntax:

tidy_text2 <- compose(
  str_squish,
  ~ str_remove_all(.x, "\""),
  str_to_upper,
  ~ iconv(.x, to = "ASCII//TRANSLIT"),
  .dir = "forward"  # compose() applies functions right to left by default
)

df %>%
  mutate(c1 = tidy_text2(c1))
#>        c1
#> 1   ELEVE
#> 2    CAFE
#> 3 CHATEAU
#> 4    NOEL
#> 5   CREME

This may not be worth it if you only use the pipeline once, of course! It is just one way of keeping the mutate call itself tidy.

Andy Baxter