df = data.frame("a" = c(1, 2, 3, "q", "r"),
                "b" = c(5,6,7,0,"s"))
dfWANT = data.frame("a" = c("1", "2", "3", NA, NA),
                    "b" = c("5", "6", "7", "0", NA))
REP = c("q", "r", "s")

df[,][df[,] == REP] <- NA

I want to specify a vector, REP, of values that should be set to NA. The original data is df and the result I want is dfWANT. The last line is my attempt, which works only on column a.

NelsonGon
bvowe

2 Answers


You could use sapply to build a logical matrix that is TRUE wherever a value of df appears in REP, then assign NA to those positions.

df[sapply(df, `%in%`, REP)] <- NA

#     a    b
#1    1    5
#2    2    6
#3    3    7
#4 <NA>    0
#5 <NA> <NA>
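
To see why this works where the `==` attempt did not, it can help to inspect the intermediate logical matrix. A minimal sketch, assuming the df and REP from the question (mask is a name introduced here for illustration only):

```r
df <- data.frame(a = c(1, 2, 3, "q", "r"),
                 b = c(5, 6, 7, 0, "s"))
REP <- c("q", "r", "s")

# One logical column per data frame column: TRUE where the value is in REP
mask <- sapply(df, `%in%`, REP)
mask

# `==` instead recycles REP element by element down the columns,
# so whether a cell matches depends on its position as well as its value
df == REP

# Assign NA at every TRUE position of the mask
df[mask] <- NA
df
```

Printing mask and df == REP side by side shows that only the `%in%` version flags the "s" in column b.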

In dplyr, we can use mutate_all to apply the same replacement to every column:

library(dplyr)
df %>% mutate_all(~replace(., . %in% REP, NA))
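
In dplyr 1.0.0 and later, mutate_all is superseded by across(). A sketch of the same replacement in that style, assuming such a dplyr version is installed and using the df and REP from the question (out is a name introduced here):

```r
library(dplyr)

df <- data.frame(a = c(1, 2, 3, "q", "r"),
                 b = c(5, 6, 7, 0, "s"))
REP <- c("q", "r", "s")

# Apply the same replace() call to every column via across();
# the result is returned, df itself is not modified
out <- df %>%
  mutate(across(everything(), ~ replace(.x, .x %in% REP, NA)))
out
```

As with mutate_all, the result has to be assigned back (for example to df) if the original should be overwritten.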
Ronak Shah

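In base R, we can convert the data.frame to a matrix and apply %in% once, without looping over columns. Because %in% returns a plain vector, `dim<-` is used to restore the shape of df: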

df[`dim<-`(as.matrix(df) %in% REP, dim(df))] <- NA
df
#     a    b
#1    1    5
#2    2    6
#3    3    7
#4 <NA>    0
#5 <NA> <NA>
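
The `dim<-` call reshapes the flat result of %in% back into a matrix. A slightly more verbose sketch of the same idea using matrix(), assuming the df and REP from the question (flat and mask are names introduced here):

```r
df <- data.frame(a = c(1, 2, 3, "q", "r"),
                 b = c(5, 6, 7, 0, "s"))
REP <- c("q", "r", "s")

# %in% drops the matrix dimensions and returns a plain logical vector
flat <- as.matrix(df) %in% REP

# Rebuild a logical matrix with the same shape as df (filled column by column)
mask <- matrix(flat, nrow = nrow(df), ncol = ncol(df))

df[mask] <- NA
df
```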

Or use data.table, which updates the columns by reference with set:

library(data.table)
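setDT(df)  # convert df to a data.table by reference
# For each column j, set the rows whose value is in REP to NA
for (j in seq_along(df)) {
  set(df, i = which(df[[j]] %in% REP), j = j, value = NA_character_)
}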
akrun