
Given this data.table:

library(data.table)
aa <- data.table(a = c(1, 2, 3), b = c(1, 2, NA), c = c(NA, 2, 3))

Is there a better way to replace NAs in a subset of columns (e.g. only b) than the following?

cols = c("b")
aa[, (cols) := {dt <- .SD; dt[is.na(dt)] <- 0; dt}, .SDcols = cols]

My approach doesn't feel very clean; there has to be a more readable way. Thanks!

[EDIT]

My first example was not very good, here's a better one:

library(data.table)
aa <- data.table(a = c(1, 2, 3), b = c(1, 2, NA), c = c(NA, 2, 3), d = c(1, NA, 3))

I need to replace NAs in an arbitrary set of columns, e.g. b and c. That means I cannot use i, because is.na() on several columns returns a logical matrix, and matrices are not allowed there.

abudis

1 Answer


For a single column, this is probably cleaner in data.table:

aa[is.na(b), b := 0]

[Edit]

For several columns I would write it like this, though I'm not sure it is any more readable than yours.

cols = c("b", "c")
aa[, (cols) := lapply(.SD, function(x){x[is.na(x)] <- 0; x}), .SDcols = cols]
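Another common data.table idiom for the same task is a plain loop over the column names with set(), which updates each column by reference and only touches the rows that are NA. The following is a minimal sketch using the example table from the question; the loop variable j is just an illustrative name.

```r
library(data.table)

# Example table from the question
aa <- data.table(a = c(1, 2, 3), b = c(1, 2, NA), c = c(NA, 2, 3), d = c(1, NA, 3))

cols <- c("b", "c")

# For each selected column, overwrite only the NA rows with 0 (by reference)
for (j in cols) {
  set(aa, i = which(is.na(aa[[j]])), j = j, value = 0)
}

print(aa)
```

Columns not listed in cols (here a and d) are left unchanged, so the NA in d stays in place.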

[Edit]

If you want to apply this to a range of columns, you could use subset() to get their names:

cols <- colnames(subset(aa, select=b:c))
aa[, (cols) := lapply(.SD, function(x){x[is.na(x)] <- 0; x}), .SDcols = cols]
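If your data.table version is recent enough (1.12.4 or later, as far as I know), setnafill() may be the most readable option for numeric columns. This is a sketch under that assumption, again using the question's example table; it fills by reference, so no := is needed.

```r
library(data.table)

# Example table from the question
aa <- data.table(a = c(1, 2, 3), b = c(1, 2, NA), c = c(NA, 2, 3), d = c(1, NA, 3))

cols <- c("b", "c")

# Replace NAs with the constant 0 in the selected columns only, by reference
setnafill(aa, type = "const", fill = 0, cols = cols)

print(aa)
```

Note that setnafill() is meant for numeric columns, so for character or factor columns the lapply() version above is the safer choice.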
Thomas
amatsuo_net