
I have a vector from which I've created a subset using this command:

sub.as = as[seq(70,100)]

The subset is as follows:

> is.vector(sub.as)
[1] TRUE

> str(sub.as)
 num [1:31] 1 0.75 0.9 0.475 0.925 0.975 1 1 0.525 1 ...

> sub.as
 [1] 1.000 0.750 0.900 0.475 0.925 0.975 1.000 1.000 0.525 1.000 0.200 0.200
[13] 0.200 0.200 0.150 0.150 0.150 0.150 0.150 0.450 0.875 0.175 0.150 0.150
[25] 0.150 0.100 0.100 0.100 0.100 0.350 1.000

I've applied rollapply to this vector in the following way:

> sub.as.avg.1 = rollapply(sub.as, width = 1,  by = 1,  FUN = mean, align = "left")

Based on this answer I've compared both vectors. The lengths of the outputs are the same:

> length(sub.as)
[1] 31

> length(sub.as.avg.1)
[1] 31

which reports that the values at these indices differ:

> which(sub.as != sub.as.avg.1)
 [1] 11 12 13 14 15 16 17 18 19 20 22 23 24 25 26 27 28 29

Values that should be different (but, as you can see, they print identically):

> sub.as[which(sub.as != sub.as.avg.1)]
 [1] 0.200 0.200 0.200 0.200 0.150 0.150 0.150 0.150 0.150 0.450 0.175 0.150
[13] 0.150 0.150 0.100 0.100 0.100 0.100

> sub.as.avg.1[which(sub.as != sub.as.avg.1)]
 [1] 0.200 0.200 0.200 0.200 0.150 0.150 0.150 0.150 0.150 0.450 0.175 0.150
[13] 0.150 0.150 0.100 0.100 0.100 0.100

Questions:

  1. Is vec the same as rollapply(vec, width = 1, by = 1, FUN = mean, align = "left")?
  2. Why does which show that there are differences between the vectors?
Wakan Tanka

1 Answer


Note the difference between the two rollapply calls below. The first uses mean and the second uses (mean). If rollapply detects that FUN is mean, it switches to a faster algorithm, and the arithmetic in that algorithm can introduce small numeric differences. (In an offline discussion Achim pointed out that this optimization is faster even with width = 1.) If (mean) is used instead, rollapply no longer recognizes that means are wanted, so the optimization is not applied and there are no numeric differences, although it will be slower.

library(zoo)

sub.as <- c(1, 0.75, 0.9, 0.475, 0.925, 0.975, 1, 1, 0.525, 1, 0.2, 0.2, 
 0.2, 0.2, 0.15, 0.15, 0.15, 0.15, 0.15, 0.45, 0.875, 0.175, 0.15, 
 0.15, 0.15, 0.1, 0.1, 0.1, 0.1, 0.35, 1)

First

# 1
r1 <- rollapply(sub.as, width = 1,  by = 1, FUN = mean, align = "left")
sub.as - r1
##  [1]  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00
##  [6]  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00
## [11]  5.551115e-17  5.551115e-17  5.551115e-17  5.551115e-17  5.551115e-17
## [16]  5.551115e-17  5.551115e-17  5.551115e-17  5.551115e-17  5.551115e-17
## [21]  0.000000e+00 -2.775558e-17 -2.775558e-17 -2.775558e-17 -2.775558e-17
## [26] -2.775558e-17 -2.775558e-17 -2.775558e-17 -2.775558e-17  0.000000e+00
## [31]  0.000000e+00
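
The pattern above (differences of about one rounding unit that appear after a large jump and then persist) is what a running-sum style rolling mean tends to produce. The following is only a sketch under that assumption, not zoo's actual code: running_mean is a hypothetical helper that updates a window sum incrementally instead of recomputing each window from scratch. The data is repeated so the block runs on its own.

```r
# Same data as sub.as above, repeated so this block is self-contained.
x <- c(1, 0.75, 0.9, 0.475, 0.925, 0.975, 1, 1, 0.525, 1, 0.2, 0.2,
       0.2, 0.2, 0.15, 0.15, 0.15, 0.15, 0.15, 0.45, 0.875, 0.175, 0.15,
       0.15, 0.15, 0.1, 0.1, 0.1, 0.1, 0.35, 1)

# Hypothetical helper (not part of zoo): rolling mean via a running sum.
running_mean <- function(x, k) {
  n <- length(x)
  # Change in the window sum each time the window slides by one position.
  step <- x[k:n] - x[c(1, seq_len(n - k))]
  # The first entry is the full sum of the first window.
  step[1] <- sum(x[seq_len(k)])
  # Accumulating the steps rebuilds every window sum; each subtraction and
  # addition is rounded, so tiny errors can enter and carry forward.
  cumsum(step) / k
}

rm1 <- running_mean(x, 1)

# Element-wise differences from the input; any nonzero entries are
# rounding error, not real changes in the data.
x - rm1

# Comparison within a tolerance treats the two vectors as equal.
isTRUE(all.equal(x, rm1))
```

Even with k = 1 the helper never returns x[i] directly; it returns a sum of rounded differences, which is why a mean over a single number can come back changed in the last bit.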

Second

# 2
r2 <- rollapply(sub.as, width = 1, by = 1, FUN = (mean), align = "left")
identical(sub.as, r2)
## [1] TRUE
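
If the faster path is preferred, the comparison itself can be made tolerant of rounding error instead. The sketch below uses base R only. differs is a hypothetical helper, and its default tolerance of sqrt(.Machine$double.eps) is an assumption chosen for illustration, not a recommendation from zoo.

```r
# Two values that print the same but differ in the last bit.
a <- 0.1 + 0.2
b <- 0.3

exact    <- a == b                    # exact comparison
tolerant <- isTRUE(all.equal(a, b))   # comparison within a tolerance

# Hypothetical helper: indices where two vectors differ by more than tol.
differs <- function(u, v, tol = sqrt(.Machine$double.eps)) {
  which(abs(u - v) > tol)
}

print(exact)
print(tolerant)
print(differs(c(1, a), c(1, b)))   # rounding noise is ignored
print(differs(c(1, 2), c(1, 3)))   # a real difference is reported
```

Using differs(sub.as, sub.as.avg.1) in place of which(sub.as != sub.as.avg.1) would report only differences larger than the chosen tolerance.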

Updated: I have revised the answer to correspond to the revision in the question.

G. Grothendieck
  • The question is why `mean` over one number (`width = 1, by = 1`) should give another value than it was before. This does not make sense. – Wakan Tanka Feb 02 '16 at 23:45