
QUESTION: Is there a way to reference the original dataset or (preferably) the dataset as it stands in the chain right before the group_by()?

nrow(mtcars)

32 (but we all knew that)

> mtcars %>% group_by(cyl) %>% summarise(count = n())
# A tibble: 3 x 2
    cyl count
  <dbl> <int>
1     4    11
2     6     7
3     8    14

Great.

mtcars %>%
  group_by(cyl) %>%
  summarise(count = n(),
            prop = n() / SOMETHING)

I understand I could put nrow(mtcars) in there, but this is just an MRE. That's not an option in a more complex chain of operations, where the data right before group_by() is no longer the original dataset.


Edit: I oversimplified the MRE. I am aware of the "." but I actually wanted to be able to pass the interim tibble off to another function (within the call to summarise), so the assign solution below does exactly what I was after. Thanks.

nzcoops
  • I realize this is an old thread, but if you are still active, could you post your solution as an answer and then accept it? That would help others who are having the same problem. Also, I don't see the code using the assign function in your update. Thanks! – ESELIA Sep 06 '22 at 13:10

2 Answers


We can use add_count() to count the rows in each group and append that count as a new column of the original data frame. If a more complex operation is needed, we can then use mutate() on the result.

library(dplyr)
library(tidyr)

mtcars %>%
  group_by(cyl) %>%
  add_count()
# # A tibble: 32 x 12
# # Groups:   cyl [3]
#    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb     n
#    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
# 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4     7
# 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4     7
# 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1    11
# 4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1     7
# 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2    14
# 6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1     7
# 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4    14
# 8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2    11
# 9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2    11
# 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4     7
# # ... with 22 more rows
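
As a rough sketch of the "use mutate from there" step, the proportion from the question could be computed as follows. This assumes a dplyr version in which add_count() accepts the `name` argument, and the column names `count` and `prop` are simply taken from the question. Because the data are not grouped here, n() inside mutate() is the total number of rows.

```r
library(dplyr)

props <- mtcars %>%
  add_count(cyl, name = "count") %>%   # rows per cyl, repeated on every row
  mutate(prop = count / n()) %>%       # ungrouped, so n() is the total row count (32)
  distinct(cyl, count, prop) %>%       # collapse to one row per cyl
  arrange(cyl)

props
# counts are 11, 7 and 14, so prop is 11/32, 7/32 and 14/32
```

The same idea carries over to a longer chain, since n() is evaluated on whatever data reaches mutate() rather than on the original dataset.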
www

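You are after the ".", the magrittr placeholder. Inside summarise() it refers to the whole data frame that was piped in (here the grouped tibble with all 32 rows), not to the current group: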

mtcars %>%
  group_by(cyl) %>%
  summarise(count = n(),
            prop = n() / nrow(.)) %>%
  ungroup()
hello_friend
  • Thanks. Unfortunately that solves the MRE but not the broader application I was after, which is my fault for oversimplifying. I actually wanted to pass the dataset to another function, which I couldn't do with the ".". The intermediate assign gets me there. – nzcoops Nov 03 '19 at 04:02
  • @nzcoops No worries, sorry for the misunderstanding. – hello_friend Nov 03 '19 at 04:04
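
The code for the intermediate assignment mentioned in the comments does not appear in this thread. One plausible reading, offered only as a sketch, is to store the interim tibble under a name right before group_by() and refer to that name inside summarise(). The filter step, the name `interim` and the helper `total_rows()` are hypothetical stand-ins for the "more complex chain" and the "other function".

```r
library(dplyr)

# Hypothetical helper standing in for "another function" that needs the whole tibble
total_rows <- function(df) nrow(df)

# Step 1: run the chain up to the point just before group_by() and keep the result
interim <- mtcars %>%
  filter(hp > 100)   # placeholder for the earlier steps of a longer chain

# Step 2: continue from the stored tibble and pass it by name inside summarise()
result <- interim %>%
  group_by(cyl) %>%
  summarise(count = n(),
            prop = n() / total_rows(interim))

result
```

Splitting the chain costs one extra name, but the interim data can then be handed to any function without relying on how "." is resolved inside nested calls.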