I previously used the mutate_all function in dplyr to replace values in my data frame. I am trying to update my code to be able to accommodate the new across function but I am unsure how to update it so that it can perform the replace function.

Below is an example dataset.

df <- data.frame(
  A = c(10,0,0,0,0,0,12,12,0,14,-99,14,-99,-99,16,16),
  B = c(10,0,0,0,12,12,12,12,0,14,-99,14,16,16,16,16),
  C = c(10,12,14,16,10,12,14,16,10,12,14,16,10,12,14,16)
)

  A   B  C
 10  10 10
  0   0 12
  0   0 14
  0   0 16
  0  12 10
  0  12 12
 12  12 14
 12  12 16
  0   0 10
 14  14 12
-99 -99 14
 14  14 16
-99  16 10
-99  16 12
 16  16 14
 16  16 16

The code I was previously using to replace a certain value (in this case -99) is below, and this worked successfully.

df %>% mutate_all(funs(replace(., .== -99, "Removed")))

      A       B  C
     10      10 10
      0       0 12
      0       0 14
      0       0 16
      0      12 10
      0      12 12
     12      12 14
     12      12 16
      0       0 10
     14      14 12
Removed Removed 14
     14      14 16
Removed      16 10
Removed      16 12
     16      16 14
     16      16 16

Below is how I tried to implement the across function for every column. This replaced all cells in the data frame with my desired replacement value, not just the instances of -99.

df %>% mutate(across(everything(), replace, -99 , "Removed"))
chipsin

1 Answer

Use:

library(dplyr)
df %>% mutate(across(everything(), ~replace(., . ==  -99 , "Removed")))
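As a possible explanation of why the original attempt overwrote every cell: `replace(x, list, values)` treats its second argument as an index, so passing `-99` directly is read as "every element except the 99th" rather than "every element equal to -99". The sketch below shows this on a single toy vector (the name `x` and its values are made up for illustration) using base R only.

```r
# Toy vector standing in for one column of the data frame (hypothetical values).
x <- c(10, -99, 14)

# -99 is used as a negative index: drop position 99, keep everything else.
# The vector has only 3 elements, so every position is selected and replaced.
wrong <- replace(x, -99, "Removed")

# A logical condition selects only the positions where the value is -99.
right <- replace(x, x == -99, "Removed")

print(wrong)
print(right)
```

This is what `across(everything(), replace, -99, "Removed")` ends up doing to each column, which is why the condition `. == -99` is needed inside the formula.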

The .cols argument for across is by default everything() so this would work as well.

df %>% mutate(across(.fns = ~replace(., . ==  -99 , "Removed")))

However, the simplest would be:

df[df == -99] <- 'Removed'
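
One thing worth keeping in mind with any of these approaches is that writing the string 'Removed' into a numeric column turns that column into character. A minimal sketch on a small made-up data frame (the name `toy` and its values are assumptions, not the data from the question):

```r
# Small hypothetical data frame with one -99 in column A.
toy <- data.frame(A = c(10, -99, 14), C = c(10, 12, 14))

# Same base R assignment as above, applied to the toy data.
toy[toy == -99] <- "Removed"

# Column A now holds strings, so numeric operations on it will no longer work.
print(class(toy$A))
print(toy$A)
```

If the columns need to stay numeric, replacing -99 with NA instead of a string may be the better fit.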
Ronak Shah
  • Thank you Ronak for these solutions. I was curious why ~ is needed in this instance, while with other functions using ~ results in the function not running. For example, this works fine: df %>% mutate(across(everything(), na_if, -99)) but this does not: df %>% mutate(across(everything(), ~na_if, -99)) – chipsin Aug 21 '20 at 06:23
  • There are different ways in which you can call a function. `~` is used when you want to call a function using formula notation. `across(everything(), na_if, -99)` doesn't use that. If you want to use formula notation, use `across(everything(), ~na_if(., -99))` – Ronak Shah Aug 21 '20 at 06:29
  • With the dplyr way the result must be saved as a new dataframe. Is it possible to replace the values of the existing one? – gd047 Mar 19 '22 at 14:16
  • `df <- df %>% ....` or `library(magrittr) df %<>% ...` – TarJae Mar 19 '22 at 14:23
  • @TarJae the first one created a new dataframe with the same name. As for the second, what if df lives in a parent environment? Is there an operator like %<<>% ? – gd047 Mar 20 '22 at 06:42
  • I am asking this because I've seen that the base R way is much faster than `df <- df %<>% ...` – gd047 Mar 20 '22 at 06:48
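
The formula-notation point from the comments can be sketched as follows. A formula such as `~na_if` only names the function without calling it, so it returns the function itself instead of a column; the formula has to contain the full call, with `.` or `.x` standing for the current column. The example below assumes dplyr 1.0.0 or later is installed and uses a made-up data frame called `toy`.

```r
library(dplyr)

# Hypothetical toy data with -99 used as a missing-value code.
toy <- data.frame(A = c(10, -99, 14), B = c(-99, 12, 16))

# Formula notation: the body is a complete call, .x is the current column.
with_formula <- toy %>%
  mutate(across(everything(), ~ na_if(.x, -99)))

# The same thing written as an ordinary anonymous function.
with_function <- toy %>%
  mutate(across(everything(), function(x) na_if(x, -99)))

print(with_formula)
```

Both forms should give the same result; which one to use is mostly a matter of taste.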