I think dplyr is awesome. However, I recently updated the package, and summarise no longer seems to aggregate by group for me. Code similar to the following used to summarise by group before the upgrade:

iris_tdt <- tbl_dt(iris)
iris_tdt %.% group_by(Species) %.% summarise(avg_petal_width = mean(Petal.Width))

  avg_petal_width
1        1.199333

This used to output a table with Species and avg_petal_width. Now the Species column is dropped and avg_petal_width is aggregated to a single value. group_by seems to be working, so I'm guessing this is an issue with summarise.

grp <- group_by(iris_tdt,Species)
groups(grp)

[[1]]
Species

Not even the example from the vignette works correctly.

hflights_df <- tbl_df(hflights)
planes <- group_by(hflights_df, TailNum)
delay <- summarise(planes,
  dist = mean(Distance, na.rm = TRUE),
  delay = mean(ArrDelay, na.rm = TRUE))

delay
      dist    delay
1 787.7832 7.094334

Any advice would be greatly appreciated.

packageDescription("dplyr")$Version #--> 0.1.2
R.version.string #--> "R version 3.0.2 (2013-09-25)"
Ben Carlson
  • I can't reproduce this (with the same R and dplyr versions). Have you checked in a clean R session? – joran Mar 11 '14 at 14:21
  • Hi @joran, thanks for the comment. You were on the right track: when I started a clean R session, the vignette code worked, but I saw the error again when I ran my script. Turns out Vincent had the full solution. – Ben Carlson Mar 14 '14 at 00:47

1 Answer

You probably have another summarise function masking dplyr's, most likely the one from the plyr package, which ignores the grouping and aggregates over the whole table.

# Works
library(dplyr)
iris_tdt <- tbl_dt(iris)
iris_tdt %.% 
  group_by(Species) %.% 
  summarise(avg_petal_width = mean(Petal.Width))

# No longer works: plyr is attached after dplyr, so plyr::summarise masks dplyr::summarise
library(plyr)
iris_tdt <- tbl_dt(iris)
iris_tdt %.% 
  group_by(Species) %.% 
  summarise(avg_petal_width = mean(Petal.Width))

If you really need both packages, load dplyr last so that its functions take precedence, or prefix all the affected functions (summarise, mutate, etc.) with their namespace (dplyr::summarise, etc.).
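To check which summarise a bare call actually resolves to in your session, something like the following may help. It is only a sketch: it uses the plain iris data frame instead of tbl_dt, nested calls instead of the %.% operator, and by_species is just an illustrative name.

```r
# Sketch: see which summarise() wins, then bypass the masking explicitly.
library(dplyr)

# Every attached package that defines summarise, in search-path order;
# the first entry is the one a bare summarise() call uses.
find("summarise")

# Namespace of the function that currently wins, e.g. "dplyr" or "plyr".
environmentName(environment(summarise))

# Namespace-qualified calls use dplyr's versions regardless of load order.
by_species <- dplyr::summarise(
  dplyr::group_by(iris, Species),
  avg_petal_width = mean(Petal.Width)
)
by_species  # one row per Species
```

If the second call reports "plyr", some package in your script attached plyr after dplyr, and the qualified form is the safer choice.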

Vincent Zoonekynd
  • That must have been it! I was using the expand.grid.df function from the reshape package, and it was loading plyr in the background. I turned that off and used an alternative expand.grid.df (found [here](http://stackoverflow.com/questions/11693599/alternative-to-expand-grid-for-data-frames) by YT). Works perfectly now. Thank you! – Ben Carlson Mar 14 '14 at 00:51