Consider the following dplyr query:

> mpg %>% group_by(class) %>% summarise(n())

The output is:

# A tibble: 7 x 2
       class   n()
       <chr> <int>
1    2seater     5
2    compact    47
3    midsize    41
4    minivan    11
5     pickup    33
6 subcompact    35
7        suv    62

Now, I want to filter the result as follows:

> mpg %>% group_by(class) %>% filter(hwy==21) %>% summarise(n())

That is, I want to show, for each class, the number of cars with a highway mileage of 21. Here is the result:

# A tibble: 2 x 2
       class   n()
       <chr> <int>
1    minivan     1
2 subcompact     1

This is expected, since filter() drops the non-matching rows before summarising. What I want instead is to see all the classes again, with n() reported as 0 for any class that has no car with a highway mileage of 21. How can I do this?

In other words, I want the dplyr query that shows the following output:

# A tibble: 7 x 2
       class   n()
       <chr> <int>
1    2seater     0
2    compact     0
3    midsize     0
4    minivan     1
5     pickup     0
6 subcompact     1
7        suv     0

where n() is the number of cars in each class with a highway mileage of 21.

Is this possible?

1 Answer

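Try this. Instead of filtering, flag the matching rows and sum the flag within each class, so that classes with no match still appear with a count of 0: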

mpg %>% mutate(k=(hwy==21)) %>% group_by(class) %>%
   summarise(n=sum(k))
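
A slightly more compact variant of the same idea, offered as a sketch rather than as part of the original answer: the logical test can be summed directly inside summarise(), because TRUE counts as 1 and FALSE as 0. This assumes ggplot2 is installed (it supplies the mpg dataset) and that hwy has no missing values; the name `result` is only illustrative.

```r
library(dplyr)
library(ggplot2)  # assumed here only because it provides the mpg dataset

# No rows are filtered out, so every class keeps its group.
# sum() over a logical vector counts the TRUE values, giving 0 when none match.
result <- mpg %>%
  group_by(class) %>%
  summarise(n = sum(hwy == 21))

print(result)
```

If hwy could contain NA, sum(hwy == 21, na.rm = TRUE) would avoid an NA count for that class.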
dimitris_ps