
Possible Duplicate:
Can `ddply` (or similar) do a sliding window?

Is there a function like rollapply (in standard R or a CRAN package) that operates on a data.frame without converting it to a matrix? rollapply can be used with a data.frame, but if the data.frame has mixed types, each window of data is converted to a character matrix.

I would prefer a function that supports width, na.pad, align, etc., just like rollapply.

Example

Take any data.frame with mixed types:

test = data.frame( Name = c( "bob" , "jane" , "joe" ) , Points = c( 4 , 9 , 1 ) )

Let's say you want to roll with a window size of 2. On the first iteration, FUN is called with a data.frame that includes only rows 1 and 2 of test.

So, on the first iteration, RollapplyThatRespectsDataFrame( ... , FUN = function( x ) { ... } ) would set x = data.frame( Name = c( "bob" , "jane" ) , Points = c( 4 , 9 ) )

On the second iteration, FUN is called with a data.frame containing rows 2 and 3 of test.

Basically, this new function does the same thing as rollapply, except that it works properly with data.frames: it doesn't convert each window to a matrix.

Suraj

1 Answer


Try this:

> library(zoo)
> DF <- data.frame(a = 1:10, b = 21:30, c = letters[1:10])
> replace(DF, 1:2, rollapply(DF[1:2], 3, sum, fill = NA))
    a  b c
1  NA NA a
2   6 66 b
3   9 69 c
4  12 72 d
5  15 75 e
6  18 78 f
7  21 81 g
8  24 84 h
9  27 87 i
10 NA NA j

Regarding the example that was added to the question after some discussion, such functionality can be layered on top of rollapply by applying it to the row indexes:

> lapply(as.data.frame(t(rollapply(1:nrow(test), 2, c))), function(ix)test[ix, ])
$V1
  Name Points
1  bob      4
2 jane      9

$V2
  Name Points
2 jane      9
3  joe      1

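And here it is wrapped up a bit better, as a data.frame method for rollapply:

rollapply.data.frame <- function(data, ..., fill = NULL, FUN, 
        simplify = function(x) do.call(rbind, x)) {
    # Roll over the row indexes rather than the data, so no column is coerced.
    # fill0 is NA when padding was requested and NULL otherwise.
    fill0 <- if (!is.null(fill)) NA
    result <- lapply(
       as.data.frame(t(rollapply(seq_len(nrow(data)), ..., fill = fill0, FUN = c))), 
       # An all-NA index window is padding: return fill instead of calling FUN.
       function(ix) {if (all(is.na(ix))) fill else FUN(data[ix, ])}
    )
    # Default: rbind the per-window results. Use simplify = identity for the raw list.
    simplify(result)
}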

> rollapply(test, 2, FUN = identity, simplify = identity)
$V1
  Name Points
1  bob      4
2 jane      9

$V2
  Name Points
2 jane      9
3  joe      1

> rollapply(test, 2, FUN = identity, fill = NA, simplify = identity)
$V1
  Name Points
1  bob      4
2 jane      9

$V2
  Name Points
2 jane      9
3  joe      1

$V3
[1] NA
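
For comparison, here is a minimal sketch of the same index-windowing idea in base R only, without zoo. The helper name roll_windows is hypothetical (it is not part of zoo or any package), and it assumes left-to-right windows with no padding, so it does not support align or fill. It is meant only to show that slicing by row index keeps every column's type intact.

```r
# Hypothetical helper, not part of zoo: return a list of overlapping row
# windows of a data.frame, each window being a data.frame itself.
roll_windows <- function(data, width) {
  n <- nrow(data)
  # No complete window fits when the width exceeds the number of rows.
  if (width > n) return(list())
  starts <- seq_len(n - width + 1)
  # Index rows directly so mixed column types are never coerced to a matrix.
  lapply(starts, function(s) data[s:(s + width - 1), , drop = FALSE])
}

test <- data.frame(Name = c("bob", "jane", "joe"), Points = c(4, 9, 1))
windows <- roll_windows(test, 2)

# Each window can use character and numeric columns together.
sums <- do.call(rbind, lapply(windows, function(x) {
  data.frame(First = as.character(x$Name[1]), Sum = sum(x$Points))
}))
print(sums)
```

With the test data above this gives one row per window, with Sum equal to 13 and 10. For width, align and fill handling, the zoo-based method is the better fit.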
G. Grothendieck
  • Works for the example, but cannot be generalized to a data.frame with numeric interspersed with character. – Suraj Dec 07 '12 at 22:52
  • Just replace `1:2` with the indices or corresponding logical vector of the numeric columns: `ix <- sapply(DF, is.numeric); replace(DF, ix, rollapply(DF[ix], 3, sum, fill = NA))` – G. Grothendieck Dec 08 '12 at 00:49
  • That works! It's a good solution if FUN only needs numeric columns. The complete solution would provide a per-window data.frame so that all types are preserved. It's a valid scenario to want both numeric and character data per window. – Suraj Dec 08 '12 at 14:08
  • Please give a reproducible example and solution. – G. Grothendieck Dec 08 '12 at 14:40
  • This is starting to look nice! But the FUN argument never gets used. Try `rollapply.data.frame( test , width = 2 , FUN = function( x ) { browser() } )`. It should throw you into the browser, but it doesn't. – Suraj Dec 08 '12 at 18:36
  • Good point. It's fixed now. – G. Grothendieck Dec 08 '12 at 22:39
  • I made a small edit: the function now combines sub-results into a single data.frame. Now try `rollapply.data.frame( test , width = 2 , FUN = function( x ) { return( data.frame( Sum = sum( x$Points ) ) ) } )` to return a rolling sum of points. It would probably be more robust if I used rbind.fill instead of rbind. Looks good? – Suraj Dec 10 '12 at 00:03
  • Sure. I moved it into a `simplify=` argument so that one can use `simplify = identity` to recover the original functionality. – G. Grothendieck Dec 10 '12 at 06:59