I have a dataset with 40 columns and 9000 rows, and every column contains at least one "NA" stored as a string. I want to drop every row that has at least one "NA", but first I need to convert those strings to actual NA values.

I cannot use the na.strings="" argument as I am getting my data using the opendatatoronto package, not read.csv.

I have also tried this code, which didn't work either:

for (i in names(data)) set(data, which(data[[i]] == "NA"), i, NA)

Phil
Ola

2 Answers

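dplyr::na_if() should do the trick. Since dplyr 1.1.0 it no longer accepts a whole data frame as its first argument, so apply it column by column:

library(dplyr)

df <- tibble(x = c('A', 'NA', 'C'),
             y = c('D', 'E', 'NA'),
             z = c('NA', 'NA', 'I'))

# Turn the string 'NA' into a real NA in every character column
df %>% mutate(across(where(is.character), ~ na_if(.x, 'NA')))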
tivd

What about

dat[dat == 'NA'] <- NA
Anil
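
For completeness, here is a minimal base R sketch of the whole workflow the question describes: convert the "NA" strings to real missing values, then drop the affected rows. The data frame `dat` and its values are made up for illustration; they stand in for whatever opendatatoronto returns, and the sketch assumes the result is an ordinary data frame with character columns.

```r
# Toy data frame standing in for the real dataset (hypothetical values).
dat <- data.frame(
  x = c("A", "NA", "C", "D"),
  y = c("E", "F", "NA", "H"),
  z = c("I", "J", "K", "L"),
  stringsAsFactors = FALSE
)

# Replace the literal string "NA" with a real missing value in every column.
dat[dat == "NA"] <- NA

# Keep only the rows that have no missing values.
clean <- na.omit(dat)

print(clean)
```

Note that na.omit() drops a row if any column is missing, so rows that already had genuine NA values before the conversion are removed as well.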