
I am confused by the following data.table behavior.

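Setting up a reproducible example:

library(data.table)
iris <- data.table(iris)[1:10]  # first 10 rows of the built-in iris data
iris[, row.ord := .I]           # row.ord holds each row's index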

Why does this:

iris[,Val1:=Sepal.Length[row.ord]+Sepal.Length[row.ord+1]]

give a different result compared to this:

iris[,Val2:=sum(Sepal.Length[row.ord:(row.ord+1)])]
#Warning messages:
#1: In row.ord:(row.ord + 1) :
#  numerical expression has 10 elements: only the first used
#2: In row.ord:(row.ord + 1) :
#  numerical expression has 10 elements: only the first used

Results

    Sepal.Length Sepal.Width Petal.Length Petal.Width Species row.ord Val Val1 Val2
 1:          5.1         3.5          1.4         0.2  setosa       1 5.1 10.0   10
 2:          4.9         3.0          1.4         0.2  setosa       2 4.9  9.6   10
 3:          4.7         3.2          1.3         0.2  setosa       3 5.1  9.3   10
 4:          4.6         3.1          1.5         0.2  setosa       4 4.9  9.6   10
 5:          5.0         3.6          1.4         0.2  setosa       5 5.1 10.4   10
 6:          5.4         3.9          1.7         0.4  setosa       6 4.9 10.0   10
 7:          4.6         3.4          1.4         0.3  setosa       7 5.1  9.6   10
 8:          5.0         3.4          1.5         0.2  setosa       8 4.9  9.4   10
 9:          4.4         2.9          1.4         0.2  setosa       9 5.1  9.3   10
10:          4.9         3.1          1.5         0.1  setosa      10 4.9   NA   10
Mike.Gahan
    `sum` is not vectorized, so it returns a single value. And `:` uses only the first element of each operand, so it always gets 1 and 2 as its arguments, as the warning tells you. – IRTFM Jun 05 '14 at 22:05

2 Answers


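As noted in the comments, the main reason is that `sum` is not vectorized: it reduces everything it is given to a single value, which `:=` then recycles to every row. One classic way to apply it element-wise is `mapply`: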

iris[, Val2:=mapply(sum,Sepal.Length[row.ord],Sepal.Length[row.ord+1])]

Now Val1 and Val2 are equal:

    Sepal.Length Sepal.Width Petal.Length Petal.Width Species row.ord Val1 Val2
 1:          5.1         3.5          1.4         0.2  setosa       1 10.0 10.0
 2:          4.9         3.0          1.4         0.2  setosa       2  9.6  9.6
 3:          4.7         3.2          1.3         0.2  setosa       3  9.3  9.3
 4:          4.6         3.1          1.5         0.2  setosa       4  9.6  9.6
 5:          5.0         3.6          1.4         0.2  setosa       5 10.4 10.4
 6:          5.4         3.9          1.7         0.4  setosa       6 10.0 10.0
 7:          4.6         3.4          1.4         0.3  setosa       7  9.6  9.6
 8:          5.0         3.4          1.5         0.2  setosa       8  9.4  9.4
 9:          4.4         2.9          1.4         0.2  setosa       9  9.3  9.3
10:          4.9         3.1          1.5         0.1  setosa      10   NA   NA
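
To see why the original `sum(...)` call yields 10 for every row, it can help to evaluate the pieces outside data.table. The sketch below assumes only base R; the names `x`, `idx`, `rng`, `single` and `pairwise` are illustrative and do not come from the question:

```r
# First ten Sepal.Length values, copied from the table in the question
x   <- c(5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9)
idx <- seq_along(x)                      # plays the role of row.ord

# `:` looks only at the first element of each operand, so this is just 1:2
# (the warning from the question is suppressed here)
rng <- suppressWarnings(idx:(idx + 1))

# sum() reduces its input to one number: x[1] + x[2]
single <- sum(x[rng])

# The vectorized form adds element i to element i + 1; the last entry is NA
pairwise <- x[idx] + x[idx + 1]
```

Here `single` corresponds to the constant `Val2` column in the question, while `pairwise` corresponds to `Val1`.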

PS: This question may help you better understand the "vectorized" aspect.

EDIT after OP comment:

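It looks like the OP wants a forward rolling sum. With zoo, a left-aligned window filled with NA keeps the result the same length as the table:

library(zoo)
iris[, valr := rollapply(Sepal.Length, 5, sum, align = "left", fill = NA)]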
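
As a cross-check that needs no extra package, a forward (left-aligned) rolling sum can also be sketched in base R with `embed`. The helper name `forward_roll_sum` is made up for this illustration, and the input vector is copied from the table in the question:

```r
# Hypothetical helper: sum of x[i], ..., x[i + k - 1] for each i,
# with NA where fewer than k values remain
forward_roll_sum <- function(x, k) {
  n <- length(x)
  if (k > n) return(rep(NA_real_, n))
  # each row of embed(x, k) holds one window of k consecutive values
  sums <- rowSums(embed(x, k))
  c(sums, rep(NA_real_, k - 1))
}

x <- c(5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9)
forward_roll_sum(x, 2)  # window of 2: same values as Val1, ending in NA
forward_roll_sum(x, 5)  # window of 5: the last four entries are NA
```

Padding at the end rather than the start is what makes the window look forward from each row.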
Community
agstudy

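It looks like the index inside `sum` is not doing what you expect:

iris[,Val2:=sum(Sepal.Length[row.ord:(row.ord+1)])]

Here `row.ord:(row.ord+1)` is evaluated once for the whole column and yields only `1:2`, so `sum` returns a single number. You need to extract the elements for each row separately, not one range for the whole table.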
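
One way to extract the elements row by row, sketched here under the assumption that a two-element window is wanted, is to loop over the indices with `vapply`. The names `x` and `pair_sum` are illustrative, and the values are copied from the table in the question:

```r
x <- c(5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9)

# For each index i, `:` now receives scalars, so i:(i + 1) is a real range.
# The last window reaches past the end of x, so its sum is NA.
pair_sum <- vapply(seq_along(x), function(i) sum(x[i:(i + 1)]), numeric(1))
```

Inside the data.table call, the same idea would presumably read `iris[, Val2 := vapply(row.ord, function(i) sum(Sepal.Length[i:(i + 1)]), numeric(1))]`, although the vectorized `Val1` form from the question is simpler and faster.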

Arun
DMC