I have this code below. I'm trying to use quantiles and then subset by groups (years, of which there are two). I think I can do this with dplyr, but it is not working:

Claims6 %>% 
  group_by(year) %>% 
  summarise(ranker = quantile(Expense, prob = c(.10, .30, .50, .80)))
rcs
alboman
  • "is not working" is very vague. When you get an error, you should post the specific error message. Though it might not always be obvious, error messages are designed to be useful and helpful! – Gregor Thomas Jun 15 '16 at 21:24
  • Try posting a reproducible example, not necessarily all of your data but some of it. It's hard to tell what you have in Claims6, and simple things like different classes make a big difference. – Nate Jun 15 '16 at 21:25
  • This question is a little different (the original poster was a little closer to the right answer), but it might get you to where you want to be: http://stackoverflow.com/q/30225560/903061 – Gregor Thomas Jun 15 '16 at 21:26
  • You're returning four values for each group; `dplyr` is naturally better at cutting data down than expanding it. If you wrap `quantile` in `list`, you can expand with `tidyr::unnest` like so: `Claims6 %>% group_by(year) %>% summarise(ranker = list(quantile(Expense, prob= c(.10,.30,.50,.80)))) %>% unnest()`, or to add on probabilities, something like `Claims6 %>% group_by(year) %>% summarise(nest_col = list(data.frame(ranker = quantile(Expense, prob= c(.10,.30,.50,.80))) %>% add_rownames('prob'))) %>% unnest()` – alistaire Jun 15 '16 at 21:48
  • You can do it in `dplyr` with `summarise` instead of `do`, but, [as shown here](http://stackoverflow.com/a/30489785/496488), you need to assign each quantile to a separate summary column. The `do` method in @M_Fidino's answer will be easier if you want to calculate several quantiles. – eipi10 Jun 15 '16 at 23:40
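
The two suggestions in the comments above can be sketched roughly as follows. This is only an illustration: the real `Claims6` is not shown in the question, so the data frame below is a made-up stand-in, and the names `p_levels`, `wide`, `long`, and `q10` to `q80` are hypothetical. The `unnest(cols = ...)` form assumes tidyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Made-up stand-in for Claims6 (assumption: a `year` grouping column
# and a numeric `Expense` column, as in the question)
set.seed(1)
Claims6 <- data.frame(year = factor(rep(c(2015, 2016), each = 10)),
                      Expense = runif(20))

p_levels <- c(0.10, 0.30, 0.50, 0.80)

# Option 1: one summary column per quantile, so each group still
# returns exactly one row
wide <- Claims6 %>%
  group_by(year) %>%
  summarise(q10 = quantile(Expense, probs = 0.10),
            q30 = quantile(Expense, probs = 0.30),
            q50 = quantile(Expense, probs = 0.50),
            q80 = quantile(Expense, probs = 0.80))

# Option 2: wrap the multi-value results in list columns, then expand
# them into one row per (year, probability) pair
long <- Claims6 %>%
  group_by(year) %>%
  summarise(prob = list(p_levels),
            ranker = list(quantile(Expense, probs = p_levels))) %>%
  unnest(cols = c(prob, ranker))
```

Option 1 gives one row per year with a column per quantile; Option 2 gives a long table that is usually easier to filter or plot.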

1 Answer

You can use dplyr's `do()` function for problems like this, because it lets each group return a whole data frame rather than a single value. I generated some data so you can test this out.

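library(dplyr)

# Example data: two years with ten random expenses each
Claims6 <- data.frame(year = factor(rep(c(2015, 2016), each = 10)),
                      Expense = runif(20))

# For each year, transpose the named quantile vector into a one-row data frame;
# data.frame() turns names such as "10%" into "X10."
Claims6 %>% group_by(year) %>% 
  do(data.frame(t(quantile(.$Expense, probs = c(0.10, 0.30, 0.50, 0.80)))))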


Source: local data frame [2 x 5]
Groups: year [2]

    year       X10.      X30.      X50.      X80.
  (fctr)      (dbl)     (dbl)     (dbl)     (dbl)
1   2015 0.06998258 0.2855598 0.5469119 0.9499181
2   2016 0.22983539 0.3691736 0.4754915 0.7058695
mfidino
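
For readers on a newer dplyr: `do()` has been superseded since dplyr 1.0.0, and dplyr 1.1.0 added `reframe()`, which allows a summary to return several rows per group. The following is a minimal sketch of the same computation under that assumption (dplyr >= 1.1.0), not part of the original answer; `p_levels` and `result` are hypothetical names, and the data is again a made-up stand-in.

```r
library(dplyr)

# Made-up stand-in for Claims6, same shape as in the answer above
set.seed(1)
Claims6 <- data.frame(year = factor(rep(c(2015, 2016), each = 10)),
                      Expense = runif(20))

p_levels <- c(0.10, 0.30, 0.50, 0.80)

# reframe() is like summarise(), but each group may return any number
# of rows; here, one row per requested probability
result <- Claims6 %>%
  group_by(year) %>%
  reframe(prob = p_levels,
          ranker = quantile(Expense, probs = p_levels))
```

The result is in long form (one row per year and probability) and is returned ungrouped; if the wide layout from the `do()` answer is preferred, it could be reshaped with `tidyr::pivot_wider()`.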