Count the number of times two values appear in a column based on the unique values of another column

Question

I have the dataframe below:

year<-c("2000","2000","2001","2002","2000")
gender<-c("M","F","M","F","M")
YG<-data.frame(year,gender)

In this dataframe I want to count the number of "M" and "F" for every year and then create a new dataframe with one row per year and one count column for each gender.

I tried something like:

library(dplyr)
ns<-YG %>%
  group_by(year) %>%
  count(YG$gender == "M")

Avoid using `$` within pipes: `count(gender == "M")` – zx8754 Dec 11 '18 at 19:34
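To illustrate the comment above, here is a minimal sketch using the sample data from the question. Inside a pipe the data frame is already passed along, so columns can be referenced by bare name. The variable names `by_flag` and `long` are only illustrative. The second call, `count(year, gender)`, is a variation not in the original attempt; it counts every year/gender pair in long format.

```r
library(dplyr)

# Sample data from the question
year   <- c("2000", "2000", "2001", "2002", "2000")
gender <- c("M", "F", "M", "F", "M")
YG     <- data.frame(year, gender)

# Bare column name instead of YG$gender, so the grouping is respected
by_flag <- YG %>%
  group_by(year) %>%
  count(gender == "M")

# Variation: count each year/gender pair directly (long format)
long <- YG %>%
  count(year, gender)
```

The long result has one row per year/gender combination that actually occurs, so combinations with zero observations are absent until the data is reshaped to wide format.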

tmfmnk · Answer 1 · 2018-12-11T21:11:09.450

2

A solution using reshape2:

library(reshape2)
dcast(YG, year ~ gender, fun.aggregate = length)

  year F M
1 2000 1 2
2 2001 0 1
3 2002 1 0

Or a different tidyverse solution:

YG %>%
 group_by(year) %>%
 summarise(M = length(gender[gender == "M"]),
           F = length(gender[gender == "F"]))

  year      M     F
  <fct> <int> <int>
1 2000      2     1
2 2001      1     0
3 2002      0     1

Or as proposed by @zx8754:

YG %>%
 group_by(year) %>%
 summarise(M = sum(gender == "M"),
           F = sum(gender == "F"))
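
As a small sketch of why the `sum()` version works: a comparison such as `gender == "M"` returns a logical vector, and `sum()` counts `TRUE` as 1 and `FALSE` as 0. The names `is_male`, `n_male` and `res` below are only illustrative.

```r
library(dplyr)

# Sample data from the question
year   <- c("2000", "2000", "2001", "2002", "2000")
gender <- c("M", "F", "M", "F", "M")
YG     <- data.frame(year, gender)

# Logical vector: TRUE where gender is "M"
is_male <- YG$gender == "M"

# Summing a logical vector counts the TRUE values
n_male <- sum(is_male)

# The same idea applied within each year
res <- YG %>%
  group_by(year) %>%
  summarise(M = sum(gender == "M"),
            F = sum(gender == "F"))
```

This approach needs one `summarise()` term per value, so it suits a small, known set of values like "M" and "F".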

edited Dec 11 '18 at 21:11

answered Dec 11 '18 at 19:41

tmfmnk


Maybe just `M = sum(gender == "M")`? – zx8754 Dec 11 '18 at 20:49
@zx8754 that is indeed very straightforward, added it to my post. Thanks. – tmfmnk Dec 11 '18 at 21:15

acylam · Accepted Answer · 2018-12-11T19:43:12.477

1

We can use `count` and `spread` to get the desired wide format, with `fill = 0` in `spread` so that year/gender combinations that do not occur are filled with 0:

library(tidyverse)
YG %>%
  group_by(year) %>%
  count(gender) %>%
  spread(gender, n, fill = 0)

Output:

# A tibble: 3 x 3
# Groups:   year [3]
  year      F     M
  <fct> <dbl> <dbl>
1 2000      1     2
2 2001      0     1
3 2002      1     0
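
In newer versions of tidyr, `spread` is superseded by `pivot_wider`. A rough equivalent, assuming tidyr 1.0.0 or later is installed, might look like the sketch below. The name `wide` is only illustrative, and `values_fill = list(n = 0L)` plays the role of `fill = 0` while keeping the counts as integers.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
year   <- c("2000", "2000", "2001", "2002", "2000")
gender <- c("M", "F", "M", "F", "M")
YG     <- data.frame(year, gender)

wide <- YG %>%
  count(year, gender) %>%                    # long format: year, gender, n
  pivot_wider(names_from  = gender,          # one column per gender value
              values_from = n,               # filled with the counts
              values_fill = list(n = 0L))    # absent combinations become 0
```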

edited Dec 11 '18 at 19:43

answered Dec 11 '18 at 19:28

acylam


No need for last mutate_all, just use `fill = 0` inside spread. – zx8754 Dec 11 '18 at 19:36
@zx8754 Thanks! I always forget about this option. – acylam Dec 11 '18 at 19:43
