Given the following example:

library(data.table)
mat <- data.table(x = c(1:10), y = c(11:20), z = c(21:30))

cut.head <- c(0, 2, 1) 
cut.tail <- c(3, 1, 2) 

cut.head gives, for each column, the number of rows to set to NA counting from the top.

cut.tail gives, for each column, the number of rows to set to NA counting from the bottom.

For example, applying cut.head sets the 1st and 2nd rows of column y to NA, as well as the 1st row of column z.

I would like the result to be as follows:

     x  y  z
 1:  1 NA NA
 2:  2 NA 22
 3:  3 13 23
 4:  4 14 24
 5:  5 15 25
 6:  6 16 26
 7:  7 17 27
 8: NA 18 28
 9: NA 19 NA
10: NA NA NA

Thank you

Matt Dowle
newbie

1 Answer

I'd just use a for loop with := (or set()) so it's fast and (fairly) easy to read.

> for (i in 1:3) mat[seq_len(cut.head[i]), (i):=NA]
> mat
     x  y  z
 1:  1 NA NA
 2:  2 NA 22
 3:  3 13 23
 4:  4 14 24
 5:  5 15 25
 6:  6 16 26
 7:  7 17 27
 8:  8 18 28
 9:  9 19 29
10: 10 20 30

Notice that the LHS of := accepts column numbers as well as names. As an aside, this is valid:

DT[, 2:=2]   # assign 2 to column 2

Wrapping the LHS of := in parentheses, (i):=NA, tells it to use the variable's value rather than its name.
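
To see what the parentheses change, here is a small sketch on a throwaway table. The names DT, col, a and b are made up for illustration, and the behaviour described in the comments is what I'd expect from current data.table versions:

```r
library(data.table)

DT <- data.table(a = 1:3, b = 4:6)   # hypothetical toy table
col <- "b"                           # variable holding a column name

DT[, col := 0L]     # no parentheses: adds a new column literally named "col"
DT[, (col) := NA]   # parentheses: col is evaluated, so column "b" becomes NA

names(DT)           # "a" "b" "col"
```

The same rule is why the loop writes (i):=NA: without the parentheses, a column called i would be created instead of column number i being updated.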

For the tail I first tried the following, but at the time .N wasn't available in i. I filed that as a feature request, FR#724.
UPDATE: .N in i was added in v1.9.3 on 11 Jul 2014, so this now works:

> for (i in 1:3) mat[.N+1-seq_len(cut.tail[i]), (i):=NA]
> mat
     x  y  z
 1:  1 NA NA
 2:  2 NA 22
 3:  3 13 23
 4:  4 14 24
 5:  5 15 25
 6:  6 16 26
 7:  7 17 27
 8: NA 18 28
 9: NA 19 NA
10: NA NA NA

Before v1.9.3, when .N wasn't available in i, we had to live with a repetition of the symbol mat:

> for (i in 1:3) mat[nrow(mat)+1-seq_len(cut.tail[i]), (i):=NA]
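
The set() alternative mentioned at the top can be sketched roughly as below. This is only an illustration under a few assumptions: the helper variables n and rows are mine, NA_integer_ is used because the example columns are integer, and head and tail are handled in one pass rather than in two loops.

```r
library(data.table)

mat <- data.table(x = 1:10, y = 11:20, z = 21:30)
cut.head <- c(0, 2, 1)
cut.tail <- c(3, 1, 2)

n <- nrow(mat)
for (j in seq_along(mat)) {
  # rows to blank in column j: the first cut.head[j] and the last cut.tail[j]
  rows <- c(seq_len(cut.head[j]), n + 1L - seq_len(cut.tail[j]))
  # skip columns with nothing to blank; set() updates mat by reference
  if (length(rows)) set(mat, i = rows, j = j, value = NA_integer_)
}
mat
```

set() avoids the overhead of calling [.data.table on every iteration, which mostly matters when looping over many columns.
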
Matt Dowle