Replace NAs with zeros in a subset of columns in a data.table

Question

Given this data.table:

library(data.table)
aa <- data.table(a = c(1, 2, 3), b = c(1, 2, NA), c = c(NA, 2, 3))

Is there a better way of replacing NAs in a subset of columns (e.g. only b), other than

cols = c("b")
aa[, (cols) := {dt <- .SD; dt[is.na(dt)] <- 0; dt}, .SDcols = cols]

My way does not feel very clean; there has to be a more readable one. Thanks!

[EDIT]

My first example was not very good; here's a better one:

library(data.table)
aa <- data.table(a = c(1, 2, 3), b = c(1, 2, NA), c = c(NA, 2, 3), d = c(1, NA, 3))

I need to replace NAs in an arbitrary set of columns, e.g. b and c. That means I cannot use i, because matrices are not allowed there.

Does it have to be `data.table`? Is `aa$b[is.na(aa$b)] <- 0` or similar good enough? — Phil, May 30 '17 at 15:45
@Phil, yes it has to be `data.table`. The columns vector is some arbitrary character vector. In the example I use `b`, but it could be any set of columns. — abudis, May 30 '17 at 15:46
this is readable but I don't know about efficiency... `apply(aa,c(1,2),function(x){if(is.na(x)) 0 else x})` — moodymudskipper, May 30 '17 at 16:33
oops sorry I misread you, you want to be able to select a subset — moodymudskipper, May 30 '17 at 16:34

score 2 · Answer 1 · edited Nov 20 '19 at 16:11

Probably this is cleaner for data.table:

aa[is.na(b), b := 0]

[Edit]

I would write it like this, but I'm not sure it is any more readable than yours.

cols = c("b", "c")
aa[, (cols) := lapply(.SD, function(x){x[is.na(x)] <- 0; x}), .SDcols = cols]
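The same replacement can also be written as a loop over the selected columns with `set()`, which updates the table by reference and only touches the rows that are NA. This is a minimal sketch, assuming the four-column `aa` from the edited question and numeric columns; whether it reads better than the `lapply` version is a matter of taste.

```r
library(data.table)

# Example table from the edited question
aa <- data.table(a = c(1, 2, 3), b = c(1, 2, NA), c = c(NA, 2, 3), d = c(1, NA, 3))
cols <- c("b", "c")

# For each selected column, overwrite only the NA rows with 0 (in place)
for (j in cols) {
  set(aa, i = which(is.na(aa[[j]])), j = j, value = 0)
}

print(aa)
```

Columns not listed in `cols` (here `a` and `d`) are left untouched, so the NA in `d` remains.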

[Edit]

If you want to apply this to a range of columns, you could use subset() to pick them:

cols <- colnames(subset(aa, select=b:c))
aa[, (cols) := lapply(.SD, function(x){x[is.na(x)] <- 0; x}), .SDcols = cols]
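If a recent data.table is available, `setnafill()` may be a shorter option. The sketch below assumes data.table 1.12.4 or later (where, as far as I know, `setnafill()` was introduced) and numeric columns; it fills by reference, so no `:=` is needed.

```r
library(data.table)

# Example table from the edited question
aa <- data.table(a = c(1, 2, 3), b = c(1, 2, NA), c = c(NA, 2, 3), d = c(1, NA, 3))
cols <- c("b", "c")

# Fill NAs with the constant 0, in place, only in the columns named in cols
setnafill(aa, type = "const", fill = 0, cols = cols)

print(aa)
```

Because `cols` is an ordinary character vector, it can be built in any way, including the `subset()` range selection shown above.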

answered May 30 '17 at 15:47 by amatsuo_net · edited Nov 20 '19 at 16:11 by Thomas

Okay, I don't think I gave a very good example. Let me update the question. – abudis May 30 '17 at 15:48
