I have the dataframe below:

year<-c("2000","2000","2001","2002","2000")
gender<-c("M","F","M","F","M")
YG<-data.frame(year,gender)

In this dataframe I want to count the number of "M" and "F" values for every year and then create a new dataframe like:

year M F
1 2000 2 1
2 2001 1 0
3 2002 0 1

I tried something like:

library(dplyr)
ns<-YG %>%
  group_by(year) %>%
  count(YG$gender == "M")
zx8754
firmo23

2 Answers

A solution using reshape2:

library(reshape2)
dcast(YG, year ~ gender, fun.aggregate = length)

  year F M
1 2000 1 2
2 2001 0 1
3 2002 1 0

Or a different tidyverse solution:

YG %>%
 group_by(year) %>%
 summarise(M = length(gender[gender == "M"]),
           F = length(gender[gender == "F"]))

  year      M     F
  <fct> <int> <int>
1 2000      2     1
2 2001      1     0
3 2002      0     1

Or as proposed by @zx8754:

YG %>%
 group_by(year) %>%
 summarise(M = sum(gender == "M"),
           F = sum(gender == "F"))
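
The sum() version works because a comparison such as gender == "M" returns a logical vector, and sum() treats TRUE as 1 and FALSE as 0. A minimal sketch on the gender column from the question (the helper name is_m is only for illustration):

```r
# Same gender values as in the question
gender <- c("M", "F", "M", "F", "M")

# Logical vector: TRUE wherever the value is "M"
is_m <- gender == "M"

# TRUE counts as 1 and FALSE as 0, so the sum is the number of matches
n_sum <- sum(is_m)

# Equivalent count via subsetting, as in the length() version above
n_len <- length(gender[is_m])
```

Both n_sum and n_len are 3 here. Inside summarise() the same expression is evaluated once per year group.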
tmfmnk

We can use count and spread to get the wide format, passing fill = 0 to spread so that missing year/gender combinations become 0:

library(tidyverse)
YG %>%
  group_by(year) %>%
  count(gender) %>%
  spread(gender, n, fill = 0)

Output:

# A tibble: 3 x 3
# Groups:   year [3]
  year      F     M
  <fct> <dbl> <dbl>
1 2000      1     2
2 2001      0     1
3 2002      1     0
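
Since tidyr 1.0.0, spread is superseded by pivot_wider. A possible equivalent is sketched below; it assumes tidyr 1.1.0 or later so that values_fill accepts a single value, and the result name wide is only for illustration:

```r
library(dplyr)
library(tidyr)

# Data from the question
year <- c("2000", "2000", "2001", "2002", "2000")
gender <- c("M", "F", "M", "F", "M")
YG <- data.frame(year, gender)

wide <- YG %>%
  count(year, gender) %>%               # one row per year/gender pair, count in n
  pivot_wider(names_from = gender,      # gender values become column names
              values_from = n,          # counts become the cell values
              values_fill = 0)          # absent pairs are filled with 0
```

Passing both columns to count() avoids the separate group_by step, so the result should come back ungrouped.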
acylam