The aim is to aggregate the observation column over consecutive blocks of a fixed number of rows.

To illustrate, example data look like:

structure(list(observation = c(1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 1)), class = "data.frame", row.names = c(NA, 
-20L), variable.labels = structure(character(0), .Names = character(0)), codepage = 65001L)

Visually, the above is:

╔═════════════╗
║ observation ║
╠═════════════╣
║      1      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      1      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      1      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      1      ║
╚═════════════╝

The end goal is to summarise each consecutive block of a designated number of rows (10 in the example output below) by the count of 1's and the mean of observation. The output would look like:

╔═══════╦══════╗
║ count ║ mean ║
╠═══════╬══════╣
║   3   ║  0.3 ║
╠═══════╬══════╣
║   1   ║  0.1 ║
╚═══════╩══════╝
user14250906

3 Answers

2

A tidyverse solution. Create a grouping variable that increments every 10 rows, based on the row number modulo 10 (d is the example data frame):

library(tidyverse)

d %>%
    mutate(rn = cumsum(row_number() %% 10 == 1)) %>%
    group_by(rn) %>%
    summarise(count = sum(observation),
              mean = mean(observation))

     rn count  mean
  <int> <dbl> <dbl>
1     1     3   0.3
2     2     1   0.1
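
To make the grouping step easier to follow, here is a minimal base R sketch of the index that the mutate() call builds. The names n, i, rn_cumsum and rn_intdiv are illustrative and not part of the original answer. It also shows an integer-division form that should give the same block ids.

```r
# Example data from the question (20 observations).
observation <- c(1, 0, 0, 0, 0, 0, 1, 0, 0, 1,
                 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)

n <- 10                       # block size (assumed; change as needed)
i <- seq_along(observation)   # row numbers 1..20

# A new block starts wherever the row number leaves remainder 1,
# so the running total of those starts is the block id.
rn_cumsum <- cumsum(i %% n == 1)

# Equivalent block id computed with integer division.
rn_intdiv <- (i - 1) %/% n + 1

all(rn_cumsum == rn_intdiv)   # do both forms agree?
table(rn_cumsum)              # number of rows per block
```

One caveat: the cumsum form relies on a remainder of 1, which never occurs when n is 1, so the integer-division form is the safer choice if the block size is a free parameter.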
bouncyball
2

Using data.table, with gl() generating a grouping index that changes every 10 rows (df1 is the example data frame):

library(data.table)
setDT(df1)[, .(count = sum(observation), mean = mean(observation)),
      .(grp = as.integer(gl(nrow(df1), 10, nrow(df1))))][, grp := NULL][]

Output:

#   count mean
#1:     3  0.3
#2:     1  0.1
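
The gl() call is the least obvious part of the code above, so the following base R sketch shows what it returns on its own. The variable names n_rows, n, grp and grp25 are illustrative assumptions, and the 25-row case is a made-up example to show what happens when the row count is not a multiple of the block size.

```r
n_rows <- 20   # number of rows in the example data
n      <- 10   # block size

# gl(n, k, length) builds a factor with n levels, each repeated k times,
# cut (or recycled) to `length` elements. as.integer() keeps the level codes.
grp <- as.integer(gl(n_rows, n, n_rows))
grp

# Hypothetical 25-row case: the last block is simply shorter.
grp25 <- as.integer(gl(25, n, 25))
tabulate(grp25)   # rows per block
```

Passing nrow(df1) as the number of levels asks for more levels than are needed, but the length argument truncates the result, so only the first few level codes appear.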
akrun
1

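You can try the base R code below, which assigns each row to a block with ceiling(seq(nrow(df)) / 10) and summarises each block with tapply: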

do.call(
  rbind,
  tapply(
    df$observation,
    ceiling(seq(nrow(df)) / 10),
    function(x) data.frame(count = sum(x), mean = mean(x))
  )
)

which gives

  count mean
1     3  0.3
2     1  0.1
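
If the block size needs to vary, the same idea can be wrapped in a small function. This is only a sketch: summarise_blocks is a hypothetical helper name, not something from base R, and it assumes the input is a 0/1 vector so that sum() gives the count of 1's.

```r
# Hypothetical helper: summarise `x` in consecutive blocks of `n` values.
summarise_blocks <- function(x, n = 10) {
  grp <- ceiling(seq_along(x) / n)           # block id for each element
  data.frame(
    count = as.vector(tapply(x, grp, sum)),  # number of 1's per block (0/1 data)
    mean  = as.vector(tapply(x, grp, mean))  # proportion of 1's per block
  )
}

# Example data from the question.
df <- data.frame(observation = c(1, 0, 0, 0, 0, 0, 1, 0, 0, 1,
                                 0, 0, 0, 0, 0, 0, 0, 0, 0, 1))

res <- summarise_blocks(df$observation, n = 10)
res
```

When the number of rows is not a multiple of n, the final block contains the leftover rows and its mean is taken over that shorter block.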
ThomasIsCoding