
Okay, I have a bit of a noob question, so please excuse me. I have a data frame object as follows:

| order_id| department_id|department    |  n|
|--------:|-------------:|:-------------|--:|
|        1|             4|produce       |  4|
|        1|            15|canned goods  |  1|
|        1|            16|dairy eggs    |  3|
|       36|             4|produce       |  3|
|       36|             7|beverages     |  1|
|       36|            16|dairy eggs    |  3|
|       36|            20|deli          |  1|
|       38|             1|frozen        |  1|
|       38|             4|produce       |  6|
|       38|            13|pantry        |  1|
|       38|            19|snacks        |  1|
|       96|             1|frozen        |  2|
|       96|             4|produce       |  4|
|       96|            20|deli          |  1|

This is the code I've used to arrive at this object:

    library(dplyr)   # %>%, left_join(), group_by(), tally()
    library(knitr)   # kable()

    temp5 <- opt %>%
        left_join(products, by = "product_id") %>%
        left_join(departments, by = "department_id") %>%
        group_by(order_id, department_id, department) %>%
        tally() %>%
        group_by(department_id)

    kable(head(temp5, 14))

As you can see, the object lists the departments present in each order_id. What I want to do now is count the number of departments for each order_id.

I tried using the summarise() function from the dplyr package, but it throws the following error:

Error in summarise_impl(.data, dots) : Evaluation error: no applicable method for 'groups' applied to an object of class "factor".

It seems so simple, but I can't figure out how to do it. Any help would be appreciated.

Edit: This is the code that I tried to run. After that, I read about the count() function in the plyr package and tried to use it as well, but it is of no use because it needs a data frame as input, whereas I only want to count the number of occurrences in the data frame.

    temp5 <- opt %>%
        left_join(products, by = "product_id") %>%
        left_join(departments, by = "department_id") %>%
        group_by(order_id, department_id, department) %>%
        tally() %>%
        group_by(department_id) %>%
        summarise(count(department))

In the output, I need to know the average number of departments ordered from per order_id, so I need something like this:

| order_id| no. of departments|
|--------:|------------------:|
|        1|                  3|
|       36|                  4|
|       38|                  4|
|       96|                  3|
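
For reference, here is a minimal sketch of one way such a table could be produced with dplyr. It assumes the sample rows shown at the top are representative, so `temp5` is rebuilt by hand here; `dept_counts` and `n_departments` are names made up for illustration. The idea is to group by `order_id` only and count distinct departments with `n_distinct()`. The error is probably caused by `count()` expecting a data frame as its first argument rather than a single column, though I have not confirmed that against the full data.

```r
library(dplyr)

# Rows copied by hand from the sample table (assumed representative).
temp5 <- data.frame(
  order_id      = c(1, 1, 1, 36, 36, 36, 36, 38, 38, 38, 38, 96, 96, 96),
  department_id = c(4, 15, 16, 4, 7, 16, 20, 1, 4, 13, 19, 1, 4, 20),
  department    = c("produce", "canned goods", "dairy eggs",
                    "produce", "beverages", "dairy eggs", "deli",
                    "frozen", "produce", "pantry", "snacks",
                    "frozen", "produce", "deli"),
  n             = c(4, 1, 3, 3, 1, 3, 1, 1, 6, 1, 1, 2, 4, 1)
)

# Hypothetical names: dept_counts, n_departments.
dept_counts <- temp5 %>%
  ungroup() %>%                  # drop any grouping left over from earlier steps
  group_by(order_id) %>%         # one group per order
  summarise(n_departments = n_distinct(department_id))

# Average number of departments per order.
avg_departments <- mean(dept_counts$n_departments)
```

Grouping by `order_id` alone matters here: with the data still grouped by `department_id`, any summary would be computed per department instead of per order.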

I should then be able to use ggplot to plot the number of orders against the number of departments in each order. I hope this is clear.
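
To make the plot I have in mind concrete, here is a rough sketch. It assumes a per-order table like the expected output already exists; `dept_counts`, `n_departments`, `orders_by_count` and `n_orders` are hypothetical names, and the four rows are just the example values from this question. `dplyr::count()` is called with its namespace in case plyr is also loaded and masks it.

```r
library(dplyr)
library(ggplot2)

# Per-order department counts, taken from the expected output in this question.
dept_counts <- data.frame(
  order_id      = c(1, 36, 38, 96),
  n_departments = c(3, 4, 4, 3)
)

# Number of orders for each distinct department count.
orders_by_count <- dept_counts %>%
  dplyr::count(n_departments, name = "n_orders")

# Bar chart: x = departments per order, y = number of orders.
p <- ggplot(orders_by_count, aes(x = factor(n_departments), y = n_orders)) +
  geom_col() +
  labs(x = "Number of departments in order", y = "Number of orders")

print(p)
```

With the full data set, the same two steps should give one bar per department count.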
