How to aggregate by the number of rows

Question

The aim is to aggregate the observation column in consecutive blocks of a fixed number of rows.

To illustrate, example data look like:

structure(list(observation = c(1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 1)), class = "data.frame", row.names = c(NA, 
-20L), variable.labels = structure(character(0), .Names = character(0)), codepage = 65001L)

Visually, the above is:

╔═════════════╗
║ observation ║
╠═════════════╣
║      1      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      1      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      1      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      1      ║
╚═════════════╝

The end goal is to aggregate every block of a designated number of consecutive rows (10 in the example output below), reporting the count of 1's and the mean for each block. The output would look like:

╔═══════╦══════╗
║ count ║ mean ║
╠═══════╬══════╣
║   3   ║  0.3 ║
╠═══════╬══════╣
║   1   ║  0.1 ║
╚═══════╩══════╝

score 2 · Accepted Answer · answered Feb 09 '21 at 22:20

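A tidyverse solution. Create a grouping variable that increments whenever row_number() modulo 10 equals 1 (rows 1, 11, 21, ...), then summarise within each group. Here d is the data frame from the question: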

library(tidyverse)

d %>%
    mutate(rn = cumsum(row_number() %% 10 == 1)) %>%
    group_by(rn) %>%
    summarise(count = sum(observation),
              mean = mean(observation))

     rn count  mean
  <int> <dbl> <dbl>
1     1     3   0.3
2     2     1   0.1
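
To see what the grouping variable looks like, here is a minimal base R sketch that computes the same index outside the pipeline. The names n_rows, block_size, rn_cumsum and rn_div are illustrative and do not come from the answer. The sketch also shows an arithmetic alternative, integer division of the zero-based row index, which yields the same groups and, unlike the modulo test, still works when the block size is 1 (x %% 1 is always 0, so the cumsum never increments).

```r
n_rows <- 20       # number of rows in the example data
block_size <- 10   # rows per block (illustrative name)

i <- seq_len(n_rows)

# Approach from the answer: start a new group wherever i %% block_size == 1
rn_cumsum <- cumsum(i %% block_size == 1)

# Alternative: integer division of the zero-based index
rn_div <- (i - 1) %/% block_size + 1

rn_cumsum
identical(as.numeric(rn_cumsum), as.numeric(rn_div))
```

Either expression can be used inside mutate(), with row_number() in place of i.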

score 2 · Answer 2 · answered Feb 09 '21 at 23:18

Using data.table

library(data.table)
setDT(df1)[, .(count = sum(observation), mean = mean(observation)),
      .(grp = as.integer(gl(nrow(df1), 10, nrow(df1))))][, grp := NULL][]

-output

#   count mean
#1:     3  0.3
#2:     1  0.1
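
The grouping expression relies on gl(n, k, length), which builds a factor with n levels, each repeated k times, cut to the requested length. The following base R sketch shows the resulting group index, assuming a hypothetical 25-row input (not the 20 rows from the question) so that the last block is incomplete:

```r
n_rows <- 25   # hypothetical row count, chosen to leave a partial last block
grp <- as.integer(gl(n_rows, 10, n_rows))

grp
# Rows per group: two full blocks of 10 and a final partial block of 5
block_sizes <- tabulate(grp)
block_sizes
```

Passing nrow(df1) as the number of levels asks for more levels than are needed, but this is harmless here because as.integer() keeps only the codes that actually occur.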

akrun


score 1 · Answer 3 · answered Feb 09 '21 at 22:17

A base R option using tapply(), where df is the data frame from the question:

do.call(
  rbind,
  tapply(
    df$observation,
    ceiling(seq(nrow(df)) / 10),
    function(x) data.frame(count = sum(x), mean = mean(x))
  )
)

which gives

  count mean
1     3  0.3
2     1  0.1
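
Since the question asks for a designated block size, the same idea can be wrapped in a small function. This is only a sketch in base R: aggregate_by_rows and its arguments are hypothetical names, and count is the sum of the values, which equals the number of 1's only for 0/1 data.

```r
# Hypothetical helper: summarise a numeric vector in consecutive blocks of n rows
aggregate_by_rows <- function(x, n) {
  grp <- ceiling(seq_along(x) / n)   # 1,1,...,2,2,... as in the answer above
  data.frame(
    count = as.vector(tapply(x, grp, sum)),
    mean  = as.vector(tapply(x, grp, mean))
  )
}

df <- data.frame(observation = c(1, 0, 0, 0, 0, 0, 1, 0, 0, 1,
                                 0, 0, 0, 0, 0, 0, 0, 0, 0, 1))

res10 <- aggregate_by_rows(df$observation, 10)
res5  <- aggregate_by_rows(df$observation, 5)
res10
```

If the number of rows is not a multiple of n, the last block is simply shorter and its mean is taken over the rows it contains.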

ThomasIsCoding

