
I have this data frame:

PoolQC          Fence           MiscFeature
<chr>           <chr>           <chr>
<NOT AVAILABLE> <NOT AVAILABLE> <NOT AVAILABLE>     
<NOT AVAILABLE> <NOT AVAILABLE> <NOT AVAILABLE>     
<NOT AVAILABLE> <NOT AVAILABLE> <NOT AVAILABLE>     
<NOT AVAILABLE> <NOT AVAILABLE> <NOT AVAILABLE>     
<NOT AVAILABLE> <NOT AVAILABLE> <NOT AVAILABLE>     
<NOT AVAILABLE> MnPrv           Shed        
<NOT AVAILABLE> <NOT AVAILABLE> <NOT AVAILABLE>     
<NOT AVAILABLE> <NOT AVAILABLE> Shed        
<NOT AVAILABLE> <NOT AVAILABLE> <NOT AVAILABLE>     
<NOT AVAILABLE> <NOT AVAILABLE> <NOT AVAILABLE>

How do I convert every <NOT AVAILABLE> to NA, so that when I run this code

df %>% 
  is.na() %>% 
  colSums() %>% 
  sort(decreasing = TRUE)

it detects the NA values?

Or can I do the conversion while reading the CSV file?

df = read.csv("C:/Users/x.csv", sep = ";")
zizamuft
  • consider checking my answer again, I've updated it to account for your question on reading NA strings during `read.csv`. Thanks – luizbarcelos Aug 07 '21 at 01:54
  • Does this answer your question? [Replacing character values with NA in a data frame](https://stackoverflow.com/questions/3357743/replacing-character-values-with-na-in-a-data-frame) – luizbarcelos Aug 07 '21 at 02:10

1 Answer


Say I have the following data frame:

df <- data.frame(foo=c("<NOT AVAILABLE>", 2), bar=c(3, "<NOT AVAILABLE>"))


Replacing all <NOT AVAILABLE> occurrences with NA:

df[df == "<NOT AVAILABLE>"] <- NA

After this assignment, every cell that held <NOT AVAILABLE> is NA, so is.na() detects it.
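To tie this back to the question, here is a minimal sketch that applies the same replacement to a small data frame shaped like the one in the question and then counts the missing values per column. The three sample rows are made up for illustration; only the column names and the <NOT AVAILABLE> placeholder come from the question.

```r
# Hypothetical sample with the question's column names (values are illustrative)
df <- data.frame(
  PoolQC      = c("<NOT AVAILABLE>", "<NOT AVAILABLE>", "<NOT AVAILABLE>"),
  Fence       = c("<NOT AVAILABLE>", "MnPrv", "<NOT AVAILABLE>"),
  MiscFeature = c("<NOT AVAILABLE>", "Shed", "Shed"),
  stringsAsFactors = FALSE
)

# Replace the placeholder string with a real NA in every column
df[df == "<NOT AVAILABLE>"] <- NA

# Count missing values per column, largest first (base R form of the question's pipe)
na_counts <- sort(colSums(is.na(df)), decreasing = TRUE)
print(na_counts)
```

This is the base R equivalent of the piped code in the question, so the pipe version should give the same counts once the placeholder has been replaced.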

For your second question, you can set specific strings to be interpreted as NA during read.csv. Example:

result = read.csv(file, na.strings = "<NOT AVAILABLE>")
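Since the file in the question is read with sep = ";", a sketch closer to that case might look like the following. The inline text stands in for the real file and its rows are invented for illustration. Note that na.strings replaces the default of "NA" rather than adding to it, so the sketch lists both strings on the assumption that literal NA entries should still be treated as missing.

```r
# Hypothetical semicolon-separated content standing in for the real file
csv_text <- "PoolQC;Fence;MiscFeature
<NOT AVAILABLE>;MnPrv;Shed
<NOT AVAILABLE>;<NOT AVAILABLE>;Shed"

# Read with the question's separator and treat both strings as missing
df <- read.csv(
  text = csv_text,
  sep = ";",
  na.strings = c("NA", "<NOT AVAILABLE>")
)

# The placeholders arrive as NA, so no replacement step is needed afterwards
print(colSums(is.na(df)))
```

With a real file, pass the path as the first argument in place of text = csv_text.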
luizbarcelos