0

My data set looks like this

df<-read.table(text="group, race
               1, 1
               1, 1
               1, 4
               2, 1
               2, 3
               2, 4
               3, 1
               2, 1
               1, 1", header=T, sep=",")

I would like to get the sum of the count of each race BY each group. I have tried using the aggregate function but to no avail

I hope to have output that looks something like this:

group   race   total
1       1      2
1       4      1
2       1      2
.
.
.
3       1      1
M--
  • 25,431
  • 8
  • 61
  • 93
crich
  • 99
  • 3
  • 8
  • Probably you need `df %>% count(group, race)` – akrun May 20 '19 at 21:36
  • 1
    What do you mean by sum of count? This looks like data where you would count values of each group and race, but I'm not sure what you would then sum. Your output doesn't quite match your data: group 1, race 1 occurs 3 times, not 2 – camille May 20 '19 at 21:43
  • On earlier version of @akrun comment: ```df %>% add_count(group, race)``` does the same thing that I have with `ave()`. – M-- May 20 '19 at 22:08
  • `as.data.frame(table(df$group, df$race))` – Cath May 21 '19 at 13:37
  • @Cath Maybe ```as.data.frame(table(group=df$group, race=df$race))``` or ```as.data.frame(table(df))``` to preserve the names. – M-- May 22 '19 at 18:28

3 Answers3

2

It can be done in base using aggregate():

with(df, aggregate(list(count=race), by=list(group=group,race=race), FUN=length))

#>   group race count
#> 1     1    1     3
#> 2     2    1     2
#> 3     3    1     1
#> 4     2    3     1
#> 5     1    4     1
#> 6     2    4     1

or using ave() (if you want to preserve your data structure and just add column total):

df$total <- ave(df$race, df[,c("group","race")], FUN=length)

#>   group race total
#> 1     1    1     3
#> 2     1    1     3
#> 3     1    4     1
#> 4     2    1     2
#> 5     2    3     1
#> 6     2    4     1
#> 7     3    1     1
#> 8     2    1     2
#> 9     1    1     3
M--
  • 25,431
  • 8
  • 61
  • 93
0
library(tidyverse)

df<-tribble(~"group", ~"race",
               1, 1,
               1, 1,
               1, 4,
               2, 1,
               2, 3,
               2, 4,
               3, 1,
               2, 1,
               1, 1)
df %>% 
    count(group, race)
#> # A tibble: 6 x 3
#>   group  race     n
#>   <dbl> <dbl> <int>
#> 1     1     1     3
#> 2     1     4     1
#> 3     2     1     2
#> 4     2     3     1
#> 5     2     4     1
#> 6     3     1     1

Created on 2019-05-20 by the reprex package (v0.2.1)

dylanjm
  • 2,011
  • 9
  • 21
0

Similar to dylanjm but slightly different logic:

df %>%
group_by(group, race) %>%
summarise(Total = length(race))

Yields the same result:

# A tibble: 6 x 3
# Groups:   group [?]
  group  race Total
  <int> <int> <int>
1     1     1     3
2     1     4     1
3     2     1     2
4     2     3     1
5     2     4     1
6     3     1     1
hank_044
  • 176
  • 6