I have this dummy dataset:
abc <- data.table(a = c("NA", "bc", "x"), b = c(1, 2, 3), c = c("n", "NA", "NA"))
I am trying to replace the string "NA" with a real NA, in place, using data.table. I tried:
for(i in names(abc)) (abc[which(abc[[i]] == "NA"), i := NA])
for(i in names(abc)) (abc[which(abc[[i]] == "NA"), i := NA_character_])
for(i in names(abc)) (set(abc, which(abc[[i]] == "NA"), i, NA))
However, even after this I still get:
abc$a
"NA" "bc" "x"
What am I missing?
EDIT: I tried @frank's answer to this question, which uses type.convert(). (Thanks, frank; I didn't know about this obscure but useful function.) The documentation for type.convert() says: "This is principally a helper function for read.table.", so I wanted to test it thoroughly. The function has a small side effect when an entire column is filled with "NA" (the string): in that case type.convert() converts the column to logical. For such a case, abc would be:
abc <- data.table(a = c("NA", "bc", "x"), b = c(1, 2, 3), c = c("n", "NA", "NA"), d = c("NA", "NA", "NA"))
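To illustrate the side effect, here is a minimal sketch using plain character vectors that mirror columns a and d above, so it runs with base R only. The variable names are made up for this example, and the as.character() step at the end is just one possible workaround, not something taken from the answer.

```r
# type.convert() is in the utils package, which is attached by default.
mixed  <- c("NA", "bc", "x")    # like column a
all_na <- c("NA", "NA", "NA")   # like column d

# as.is = TRUE keeps strings as character instead of converting to factor.
mixed_out  <- type.convert(mixed, as.is = TRUE)
all_na_out <- type.convert(all_na, as.is = TRUE)

class(mixed_out)    # "character": the "NA" string is now a real NA
class(all_na_out)   # "logical": the original character type is lost

# Hypothetical workaround: coerce back to character afterwards.
all_na_fixed <- as.character(all_na_out)
class(all_na_fixed) # "character"
```

So a column that is entirely "NA" needs an extra coercion step if its type has to stay character.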
EDIT2: To summarize, the code from the original question:
for(i in names(abc)) (set(abc, which(abc[[i]] == "NA"), i, NA))
works fine, but only in recent versions of data.table (> 1.11.4). So if you are facing this problem, it's better to update data.table and use this code rather than type.convert().
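For reference, a minimal sketch of that loop on the four-column example, assuming a data.table version newer than 1.11.4 is installed. Restricting the loop to character columns and using NA_character_ instead of NA are my own additions for this sketch (chr_cols is a made-up name); they only make the intended column type explicit.

```r
library(data.table)

abc <- data.table(a = c("NA", "bc", "x"),
                  b = c(1, 2, 3),
                  c = c("n", "NA", "NA"),
                  d = c("NA", "NA", "NA"))

# Assumption: only character columns can contain the string "NA".
chr_cols <- names(abc)[vapply(abc, is.character, logical(1))]

for (j in chr_cols) {
  # set() modifies abc by reference; i is the row index, j the column name.
  set(abc, i = which(abc[[j]] == "NA"), j = j, value = NA_character_)
}

abc$a         # NA "bc" "x"
class(abc$d)  # "character", even though every value is now NA
```

Unlike type.convert(), this keeps column d as character.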