I have a data.table in R from which I want to throw away the first and the last n rows. I want to apply some filtering first and then truncate the result. I know I can do it this way:

example=data.table(row1=seq(1,1000,1),row2=seq(2, 3000,3))
e2=example[row1%%2==0]
e2[100:(nrow(e2)-100)]

Is there a possibility of doing this in one line? I thought of something like:

example[row1%%2==0][100:-100]

This of course does not work, but is there a simpler solution that does not require an additional variable?

Matt Dowle
theomega
  • Did you test the line `e2[100:length(e2)-100]`? I think you meant `nrow`, not `length` (`length(DT)` is the number of columns), and it needs brackets too, because `:` has higher precedence than `-`. I'll edit the question and answer ... – Matt Dowle Apr 13 '12 at 09:22
  • You are correct, my example didn't return what I intended. – theomega Apr 13 '12 at 09:40

2 Answers

 library(data.table)
 example=data.table(row1=seq(1,1000,1),row2=seq(2, 3000,3))
 n = 5
 # Exclude the rows whose row names are among the first n or the last n
 str(example[!rownames(example) %in% 
                  c( head(rownames(example), n), tail(rownames(example), n)), ])
Classes ‘data.table’ and 'data.frame':  990 obs. of  2 variables:
 $ row1: num  6 7 8 9 10 11 12 13 14 15 ...
 $ row2: num  17 20 23 26 29 32 35 38 41 44 ...
 - attr(*, ".internal.selfref")=<externalptr> 

Added a one-liner version that includes the selection criterion:

str( 
     (res <- example[row1 %% 2 == 0])[ n:( nrow(res)-n ),  ] 
      )
Classes ‘data.table’ and 'data.frame':  491 obs. of  2 variables:
 $ row1: num  10 12 14 16 18 20 22 24 26 28 ...
 $ row2: num  29 35 41 47 53 59 65 71 77 83 ...
 - attr(*, ".internal.selfref")=<externalptr> 

And further added this version, which does not use an intermediate named value:

str(  
example[row1 %% 2 == 0][n:(sum( row1 %% 2==0)-n ),  ] 
   )
Classes ‘data.table’ and 'data.frame':  491 obs. of  2 variables:
 $ row1: num  10 12 14 16 18 20 22 24 26 28 ...
 $ row2: num  29 35 41 47 53 59 65 71 77 83 ...
 - attr(*, ".internal.selfref")=<externalptr> 
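
Note that the range `n:(nrow - n)` used above keeps row n itself, which is why 491 rows come back rather than 490. If the goal is to drop exactly n rows from each end, one possible alternative (a minimal sketch, not part of the original answer) is to chain `head()` and `tail()` with negative counts. The name `trimmed` is only illustrative.

```r
library(data.table)

example <- data.table(row1 = seq(1, 1000, 1), row2 = seq(2, 3000, 3))
n <- 5

# Filter first, then drop the last n rows with head(., -n)
# and the first n rows with tail(., -n).
trimmed <- tail(head(example[row1 %% 2 == 0], -n), -n)

nrow(trimmed)     # number of filtered rows minus 2 * n
trimmed$row1[1]   # first value that survives the trimming
```

This avoids spelling out the row count at all, at the cost of two extra function calls.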
IRTFM

In this case you know the name of one column that exists (row1), so length(row1), or length() of any other column, evaluated inside the second [...] returns the number of rows of the unnamed temporary data.table:

example=data.table(row1=seq(1,1000,1),row2=seq(2, 3000,3))

e2=example[row1%%2==0]
ans1 = e2[100:(nrow(e2)-100)]

ans2 = example[row1%%2==0][100:(length(row1)-100)]

identical(ans1,ans2)
[1] TRUE
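
If you would rather not depend on a column name, the special symbol `.N` can also be used inside `i`, where it stands for the number of rows of the table being subset. The following is a sketch assuming a data.table version that supports `.N` in `i`; `ans3` is just an illustrative name.

```r
library(data.table)

example <- data.table(row1 = seq(1, 1000, 1), row2 = seq(2, 3000, 3))

# Reference result using an intermediate variable, as in the question.
e2 <- example[row1 %% 2 == 0]
ans1 <- e2[100:(nrow(e2) - 100)]

# Inside i, .N is the row count of the (unnamed) filtered table.
ans3 <- example[row1 %% 2 == 0][100:(.N - 100)]

# Compare the two results.
all.equal(ans1, ans3)
```

As with the `length(row1)` version, the parentheses around `.N - 100` are needed because `:` binds more tightly than `-`.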
Matt Dowle