Replace character values in a data frame with NA in base R

Question

I have a data frame like:

       Domain         Phylum          Class          Order
ID_1 Bacteria  Cyanobacteria Unclassified_c Unclassified_o
ID_2 Bacteria  Cyanobacteria Unclassified_c Unclassified_o
ID_3 Bacteria  Bacteroidetes Unclassified_c Unclassified_o
ID_4 Bacteria Proteobacteria Unclassified_c Unclassified_o
ID_5 Bacteria  Bacteroidetes Unclassified_c Unclassified_o

and I want to replace all the values Unclassified_c, Unclassified_o, element_3, etc. with NA, so I tried:

df[df == "Unclassified_c" ] <- NA

This works well when I replace one value at a time, but sometimes there are too many values to handle that way. So I would like to define a vector of values to remove and use it, something like:

Remove_list <- c("Unclassified_c", "Unclassified_o", "element_3", "element_4", "element_x")

and then use that vector to replace the matching values with NA:

df[ df == Remove_list ] <- NA

This changes some of the values to NA, but not all of them. I don't want to use the stringr library, because it drops the row names (ID_1 .. ID_x) and I need them, so I would like to do this in base R. Any suggestions?

Thanks so much !!!!

score 3 · Accepted Answer · answered May 12 '20 at 05:42

We can use sapply with %in%, which returns a logical matrix indicating whether each value is present in Remove_list. We can then assign NA wherever that matrix is TRUE.

df[sapply(df, `%in%`, Remove_list)] <- NA

df
#       Domain         Phylum Class Order
#ID_1 Bacteria  Cyanobacteria  <NA>  <NA>
#ID_2 Bacteria  Cyanobacteria  <NA>  <NA>
#ID_3 Bacteria  Bacteroidetes  <NA>  <NA>
#ID_4 Bacteria Proteobacteria  <NA>  <NA>
#ID_5 Bacteria  Bacteroidetes  <NA>  <NA>
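
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the data frame from the values shown in the question (the `data.frame()` construction is my assumption about how the data was created) and also shows why the original `df == Remove_list` attempt only caught some cells: `==` compares position by position and recycles the shorter vector, while `%in%` tests membership. The names `partial`, `hits` and `df2` are just illustrative.

```r
# Rebuild the example data from the question (construction is an assumption).
df <- data.frame(
  Domain = rep("Bacteria", 5),
  Phylum = c("Cyanobacteria", "Cyanobacteria", "Bacteroidetes",
             "Proteobacteria", "Bacteroidetes"),
  Class  = rep("Unclassified_c", 5),
  Order  = rep("Unclassified_o", 5),
  row.names = paste0("ID_", 1:5),
  stringsAsFactors = FALSE
)

Remove_list <- c("Unclassified_c", "Unclassified_o",
                 "element_3", "element_4", "element_x")

# `==` is positional: Remove_list is recycled along the cells of df, so a
# cell only matches when the value lined up with it happens to be equal.
partial <- df == Remove_list

# `%in%` is a membership test, applied here column by column.
# sapply() collects the per-column results into a logical matrix.
hits <- sapply(df, `%in%`, Remove_list)

# Alternative sketch: replace column by column; `df2[] <-` keeps the
# data frame structure, including row names.
df2 <- df
df2[] <- lapply(df2, function(x) replace(x, x %in% Remove_list, NA))

# The approach from the answer: index with the logical matrix.
df[hits] <- NA

rownames(df)   # row names ID_1 .. ID_5 are still there
```

Both variants use only base R, so the row names are left untouched.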

Ronak Shah

Mine was 3 seconds before yours, but yours is more elegant. – r2evans May 12 '20 at 05:45
thanks so much, it works well !!!! – abraham May 12 '20 at 18:41