R Count number of times a level occurs in n rows

Question

I have, for example, a vector with 1000 observations and 3 levels (A, B, C). I want to count how many times level A occurs in every block of 5 rows and produce another vector of those counts, i.e. one with 200 observations. Is anyone able to help? I've found how to count based on another variable, but not based on the number of rows. Thank you!

df <- data.frame(test=factor(sample(c("A","B", "C" ),1000,replace=TRUE)))
head(df, 10)
   test
1     A
2     A
3     B
4     C
5     B
6     A
7     C
8     B
9     C
10    C

Perhaps `lapply(split(df$test, rep(1:200, each = 5)), table)`? — talat, Apr 27 '16 at 12:44
Possible duplicate of [R - how to count how many values per level in a given factor?](http://stackoverflow.com/questions/26114525/r-how-to-count-how-many-values-per-level-in-a-given-factor) — , Apr 27 '16 at 13:49

score 4 · Accepted Answer · answered Apr 27 '16 at 13:08

Here are a couple of options you might find useful:

a) count all entries per 5 rows and return a list:

head(lapply(split(df$test, rep(1:200, each = 5)), table), 2)
# $`1`      # <- result for rows 1:5
# 
# A B C 
# 1 0 4 
# 
# $`2`      # <- result for rows 6:10
# 
# A B C 
# 3 0 2

b) count all entries per 5 rows and return a matrix:

head(t(sapply(split(df$test, rep(1:200, each = 5)), table)), 2)
#   A B C
# 1 1 0 4
# 2 3 0 2

c) count number of As per 5 rows and return a list:

head(lapply(split(df$test == "A", rep(1:200, each = 5)), sum), 2)
# $`1`
# [1] 1
# 
# $`2`
# [1] 3

d) count number of As per 5 rows and return a vector:

head(sapply(split(df$test == "A", rep(1:200, each = 5)), sum), 2)
# 1 2 
# 1 3

Each of the results will be 200 entries long / have 200 rows.
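
Option d) can be wrapped in a small helper so that the block size is not hard-coded. This is a minimal sketch, not part of the original answer: the function name `count_level_per_block` and its arguments are made up for illustration, and it assumes that a trailing block shorter than `n` should still be counted.

```r
# Hypothetical helper (name and arguments are illustrative, not from the answer).
# Counts how often `level` occurs in each consecutive block of `n` elements of `x`.
count_level_per_block <- function(x, level, n) {
  # Block index: 1 for elements 1..n, 2 for elements n+1..2n, and so on.
  block <- (seq_along(x) - 1) %/% n + 1
  # Same idea as option d): split a logical vector by block and sum each piece.
  counts <- sapply(split(x == level, block), sum)
  unname(counts)
}

set.seed(1)  # arbitrary seed so the example is reproducible
x <- factor(sample(c("A", "B", "C"), 1000, replace = TRUE))
a_counts <- count_level_per_block(x, "A", 5)
length(a_counts)  # 200
```

Because the block index is derived from the length of `x`, the same call should also work when the length is not a multiple of `n`; the last block is then simply shorter.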

Instead of `rep(1:200, each = 5)` you could also use something like `((seq_len(nrow(df)) -1) %/% 5) +1` — talat, Apr 27 '16 at 13:15
An alternative to `split`ting could be `table(rep(seq_len(nrow(df) / 5), each = 5), df$test)` — alexis_laz, Apr 27 '16 at 13:44
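
The two comments above can be combined: build the block index with integer division and cross-tabulate it against the factor. The following is a rough sketch that regenerates the `df` from the question; the seed is arbitrary, and `block`, `tab` and `a_per_block` are names chosen here for illustration.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
df <- data.frame(test = factor(sample(c("A", "B", "C"), 1000, replace = TRUE)))

# Block index built from the row number, so no hard-coded 200 is needed.
block <- (seq_len(nrow(df)) - 1) %/% 5 + 1

# Cross-tabulate block against level: one row per block, one column per level.
tab <- table(block, df$test)
dim(tab)  # 200 rows, 3 columns

# Counts of "A" per block as a plain vector.
a_per_block <- as.vector(tab[, "A"])

# Compare with option d) from the answer; both should give the same counts.
d <- sapply(split(df$test == "A", rep(1:200, each = 5)), sum)
all(a_per_block == d)
```

Since `table()` works on the factor itself, levels that never occur in a block still get a 0 in the corresponding cell.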

score 2 · Answer 2 · answered Apr 27 '16 at 13:10

Here is a solution with dplyr and tidyr:

library(dplyr)
library(tidyr)
df %>%
  mutate(Set = (seq_along(test) - 1) %/% 5) %>%
  group_by(Set, test) %>%
  summarise(N = n()) %>%
  spread(key = test, value = N, fill = 0)

Thierry

18,049
5
48
66

score 1 · Answer 3 · answered Apr 27 '16 at 12:48

We can use data.table:

library(data.table)
# setDT() converts df to a data.table by reference; .N counts rows per (grp, test) pair
setDT(df)[, .N, by = .(grp = gl(nrow(df), 5, nrow(df)), test)]

akrun

score 0 · Answer 4 · answered Apr 27 '16 at 13:16

If you prefer dplyr, you could use

library(dplyr)

c1 <- df %>%
  # label each block of 5 rows G1, G2, ..., G200
  mutate(group = rep(paste0("G", seq(1, 200)), each = 5)) %>%
  # count each level within each group
  count(group, test)

Note that this method omits levels that do not occur in a given group, so the result contains no rows with a count of 0.
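
If the zero counts matter, one base R way to keep them is to tabulate first and then reshape to long format, since `table()` reports every combination of group and factor level. This is only a sketch of that alternative, not the dplyr approach above; the `G1` to `G200` labels mirror the answer, the seed is arbitrary, and `long` is a name picked for illustration.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
df <- data.frame(test = factor(sample(c("A", "B", "C"), 1000, replace = TRUE)))

# Same group labels as in the answer: five rows each for G1, G2, ..., G200.
group <- rep(paste0("G", seq(1, 200)), each = 5)

# table() counts every group/level combination, including those that never occur;
# as.data.frame() turns the cross-tab into one row per combination.
long <- as.data.frame(table(group = group, test = df$test),
                      responseName = "n")

nrow(long)  # 600: 200 groups x 3 levels
head(long[long$n == 0, ])  # combinations that a grouped count would leave out
```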
