I'm new to R and trying to understand how dplyr works so I can apply it to my own dataset. I'm working through the example in the dplyr vignette, which uses the starwars dataset: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
I'm trying to group the starwars data frame by species and sex, and then find the mean height and mass for each species and sex combination. The code is copied from the tutorial:
starwars %>%
  group_by(species, sex) %>%
  select(height, mass) %>%
  summarise(
    height = mean(height, na.rm = TRUE),
    mass = mean(mass, na.rm = TRUE)
  )
And I should be getting this output:
#> Adding missing grouping variables: `species`, `sex`
#> `summarise()` has grouped output by 'species'. You can override using the `.groups` argument.
#> # A tibble: 41 x 4
#> # Groups:   species [38]
#>   species  sex   height  mass
#>   <chr>    <chr>  <dbl> <dbl>
#> 1 Aleena   male      79    15
#> 2 Besalisk male     198   102
#> 3 Cerean   male     198    82
#> 4 Chagrian male     196   NaN
#> # … with 37 more rows
But instead I'm getting this:
#> Adding missing grouping variables: `species`, `sex`
#>    height     mass
#> 1 174.358 97.31186
Could someone help me understand why the result collapses all species and sexes into a single row of overall means, instead of keeping the groups separate and returning one row per species and sex combination?
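A minimal sketch for narrowing this down, under an assumption that is not confirmed anywhere above: that another attached package which also defines summarise() (plyr is a common example) was loaded after dplyr and is masking it. A single-row data frame of overall means would be consistent with that, but it is only a guess. The check uses find() from base R's utils package to list where summarise() is defined, then reruns the tutorial pipeline with explicit dplyr:: prefixes so masking cannot affect it. The name result is just an illustrative variable.

```r
library(dplyr)

# List every attached package that defines summarise(). If something other
# than "package:dplyr" comes first, that version is the one a bare
# summarise() call resolves to.
print(find("summarise"))

# Same pipeline as in the tutorial, but with explicit namespaces so that
# the dplyr versions of the verbs are used regardless of load order.
result <- starwars %>%
  dplyr::group_by(species, sex) %>%
  dplyr::select(height, mass) %>%
  dplyr::summarise(
    height = mean(height, na.rm = TRUE),
    mass = mean(mass, na.rm = TRUE)
  )

print(result)
```

If the namespaced version returns one row per species and sex while the original call does not, masking is the likely cause, and loading dplyr last (or keeping the dplyr:: prefix) should be enough. If both versions still collapse to one row, the assumption is wrong and the cause lies elsewhere.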
Thanks!