
Running R 4.0.2 and dplyr 1.0.2

I am trying to use n = n() in a summarize call on a srvyr object:

relduration_by_age_grp <- l %>% 
  filter(ongoing == 0 & ptype == i) %>% 
  select(ego.id, ptype, age.grp, ego.age.grp, duration, ego.wawt) %>%
  mutate(min.age.grp = ifelse(age.grp < ego.age.grp, 
                              age.grp,
                              ego.age.grp)) %>%
  srvyr::as_survey(ids=1, weights=ego.wawt) %>%
  group_by(ptype, min.age.grp) %>%
  summarize(n = n(),
            wtd.median = srvyr::survey_median(duration, na.rm=TRUE),
            wtd.mean = srvyr::survey_mean(duration, na.rm=TRUE), 
            median = srvyr::unweighted(median(duration, na.rm=TRUE)),
            mean = srvyr::unweighted(mean(duration, na.rm=TRUE)))

Based on other questions and answers, I've also tried dplyr::summarize(n = dplyr::n(), ...), but that results in the same error. Is the problem that dplyr's n() cannot be used on a srvyr object? There does not appear to be a similar function in srvyr that can be used in a summarize call.

Thanks!

Martina Morris
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Can you also share your `sessionInfo()` so we can see what packages are loaded? – MrFlick Aug 27 '20 at 01:05
  • Hi Martina! Long time no see. – Ben Bolker Aug 27 '20 at 02:23
  • Hey Ben -- nice to see you too :) – Martina Morris Aug 27 '20 at 22:09
  • I also found that loading plyr on top of dplyr can sometimes cause n() or group_by to stop working and yield that message. – Juan C Nov 12 '20 at 12:35

3 Answers


The cause of this error is that R picks the wrong summarize function: when both dplyr and plyr are loaded, plyr's version can mask dplyr's, and n() only works inside dplyr verbs.

Fortunately, we can tell R explicitly which package to use by prefixing the function with the package name and ::.

So use dplyr::summarise().
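As a minimal sketch of the namespace-qualified call, the snippet below uses the built-in mtcars data frame (an illustrative stand-in, not the asker's data) and also checks which package an unqualified summarise currently resolves to. It assumes only dplyr is installed; it does not involve a srvyr object.

```r
library(dplyr)

# Which package does an unqualified summarise() resolve to right now?
# If plyr was attached after dplyr, this would report "plyr" instead.
summarise_source <- environmentName(environment(summarise))

# Fully qualified calls are immune to masking by other packages.
counts <- mtcars %>%
  dplyr::group_by(cyl) %>%
  dplyr::summarise(n = dplyr::n(), mean_mpg = mean(mpg))

print(summarise_source)
print(counts)
```

If summarise_source reports something other than dplyr, qualifying the call (or detaching the masking package) is the thing to try first.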

Hammao

As far as I can tell, unlike dplyr (which accepts pretty much any summary function that returns a scalar, as well as its own specialized functions such as n()), srvyr::summarize gives you a limited choice of functions: from ?srvyr::summarize,

Summarise for ‘tbl_svy’ objects accepts several specialized functions. [emphasis added]

i.e., survey_mean, survey_total, survey_ratio, and a couple of others.

Here's a hack that seems to work: calculate the weighted total (survey_total) of the inverse weights. Each row then contributes pw * (1/pw) = 1, so the total is the unweighted row count.

library(srvyr)
data(api, package="survey")
aa <- (apistrat 
      %>% as_survey_design(strata=stype, weights=pw) 
      %>% group_by(stype) 
)
aa %>% summarize(n=survey_total(1/pw))

This matches table(apistrat$stype).
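A hedged sketch of that comparison is below. It puts the inverse-weight total next to srvyr::unweighted(n()), which the question already uses for other unweighted statistics. Whether unweighted(n()) works depends on the installed srvyr version, so treat that line as an assumption to verify locally; the column names n_hack and n_unw are made up for illustration.

```r
library(srvyr)
data(api, package = "survey")

strat <- apistrat %>%
  as_survey_design(strata = stype, weights = pw) %>%
  group_by(stype)

counts <- strat %>%
  summarize(
    n_hack = survey_total(1 / pw),  # weighted total of inverse weights
    n_unw  = unweighted(n())        # assumed: plain row count per group
  )

# Reference counts from base R, in factor-level order
base_counts <- as.vector(table(apistrat$stype))

print(counts)
print(base_counts)
```

Note that survey_total also returns a standard-error column (n_hack_se here), which has no useful interpretation for a row count and can be dropped.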

Ben Bolker

Maybe it is because you loaded a package, such as "operators", that masks "%>%" from the dplyr package.
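One way to check for this kind of masking, sketched under the assumption that only dplyr is attached: ask R where the pipe it would actually use comes from, and list any names defined in more than one attached package. dplyr re-exports the pipe from magrittr, so "magrittr" is the expected answer in a clean session; a different package name would point to masking.

```r
library(dplyr)

# Namespace of the `%>%` that an unqualified call would use
pipe_source <- environmentName(environment(`%>%`))
print(pipe_source)

# Names that exist in more than one attached package, grouped by package
masked <- conflicts(detail = TRUE)
print(names(masked))
```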