I have this data and I want to sum the rows in groups of three: rows 1-3, 4-6, 7-9, 10-12, and so on. My data frame has 48 rows and 795 variables. Could you help me?

My data frame:

       X1       X2      X3      X4      X5      X6       X7     X8      X9     
1     0.00     0.00  136.29    0.00   60.52    0.00     0.00   0.00    0.00
2     0.00     0.00 4658.69    0.00    0.00 1749.50     0.00   0.00    0.00
3     0.00     0.00    0.00    0.00    0.00  125.86     0.00   0.00    0.00
4     0.00     0.00  119.34    0.00    0.00    0.00     0.00   0.00    0.00
5     0.00     0.00 4674.16 2107.55    0.00    0.00     0.00   0.00    0.00
6     0.00     0.00    0.00    0.00    0.00 5689.40     0.00   0.00    0.00
7  4270.87     0.00    0.00    0.00    0.00 3275.74     0.00   0.00    0.00
8     0.00   455.04    0.00    0.00    0.00 1296.30     0.00   0.00    0.00
9     0.00     0.00    0.00    0.00    0.00 9887.52     0.00   0.00    0.00
10    0.00     0.00    0.00    0.00    0.00    0.00     0.00   0.00    0.00
11    0.00     0.00    0.00    0.00 2169.64    0.00     0.00   0.00  699.93
12    0.00 12524.50    0.00    0.00    0.00    0.00     0.00   0.00    0.00

This is what I want:

       X1       X2      X3      X4      X5      X6       X7     X8      X9
1     0.00     0.00  ......
2     0.00     0.00  ......
3  4270.87   455.04  ......
4     0.00 12524.50  ......
Joan Triay

4 Answers

A base R solution using filter (that is, stats::filter, which dplyr masks with its own filter once it is loaded, so use the namespace prefix to be safe):

data.frame(lapply(df, function(x) stats::filter(x, c(1,1,1), sides=1)[seq(3, nrow(df), 3)] ))
#       X1       X2      X3      X4      X5       X6 X7 X8     X9
#1    0.00     0.00 4794.98    0.00   60.52  1875.36  0  0   0.00
#2    0.00     0.00 4793.50 2107.55    0.00  5689.40  0  0   0.00
#3 4270.87   455.04    0.00    0.00    0.00 14459.56  0  0   0.00
#4    0.00 12524.50    0.00    0.00 2169.64     0.00  0  0 699.93

As @alexis_laz notes above, rowsum (see ?rowsum) is probably preferable, since it was written for exactly this purpose. It takes the form:

rowsum(data, appropriate_grouping_vector)

So, something like:

rowsum(df, (seq_len(nrow(df)) - 1) %/% 3)
#       X1       X2      X3      X4      X5       X6 X7 X8     X9
#0    0.00     0.00 4794.98    0.00   60.52  1875.36  0  0   0.00
#1    0.00     0.00 4793.50 2107.55    0.00  5689.40  0  0   0.00
#2 4270.87   455.04    0.00    0.00    0.00 14459.56  0  0   0.00
#3    0.00 12524.50    0.00    0.00 2169.64     0.00  0  0 699.93
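
To make the grouping vector concrete, here is a minimal sketch on the first three columns of the question's data, typed in by hand. The names n, grp and res are just illustrative choices, not part of the original answer.

```r
# First three columns of the 12 rows shown in the question
df <- data.frame(
  X1 = c(0, 0, 0, 0, 0, 0, 4270.87, 0, 0, 0, 0, 0),
  X2 = c(0, 0, 0, 0, 0, 0, 0, 455.04, 0, 0, 0, 12524.50),
  X3 = c(136.29, 4658.69, 0, 119.34, 4674.16, 0, 0, 0, 0, 0, 0, 0)
)

n <- 3                                  # block size (illustrative name)
grp <- (seq_len(nrow(df)) - 1) %/% n    # 0 0 0 1 1 1 2 2 2 3 3 3
res <- rowsum(df, grp)                  # one row of column sums per block
res
```

Integer division of the zero-based row position by n gives every run of n consecutive rows the same label, and rowsum then adds up each column within a label.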
thelatemail

The code below is a dplyr solution adapted from the question "Summing columns on every nth row of a data frame in R".

It solves the problem by creating an index variable that labels each block of n consecutive rows, grouping by it, and then summing every column within each group. n can take any value; if nrow(df) is not divisible by n, the last group simply contains fewer rows.

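library(dplyr)

# dummy data: 30 rows, 3 numeric columns
df <- data.frame(a = runif(30), b = runif(30), c = runif(30))

n <- 3

df %>%
  # index variable: 1,1,1,2,2,2,... (one level per block of n rows)
  group_by(indx = gl(ceiling(nrow(df)/n), n, nrow(df))) %>%
  # sum every non-grouping column within each block (needs dplyr >= 1.0.0)
  summarise(across(everything(), sum))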
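
To see what the index variable looks like on its own, here is a small base R sketch. The row counts 30 and 10 are arbitrary illustrative values; the second call shows what gl does when the row count is not a multiple of n, which is worth checking against your own data.

```r
n <- 3

# 30 rows: ten full blocks of three
indx_full <- gl(ceiling(30 / n), n, 30)
table(indx_full)          # every level appears exactly n times

# 10 rows: gl() cuts the pattern off at the requested length,
# so the last block holds the single leftover row
indx_short <- gl(ceiling(10 / n), n, 10)
as.integer(indx_short)    # 1 1 1 2 2 2 3 3 3 4
```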
Tad Dallas

Split the data into blocks of n rows, then sum each block:

# dummy data
df1 <- mtcars[1:12, 1:6]

# split sum combine 
t(sapply(split(df1, rep(1:4, each = 3)), colSums))
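
The grouping above is hard-coded for 12 rows. A sketch of the same split-sum-combine idea with the grouping vector computed from the row count might look like this (n, grp and res are illustrative names):

```r
# dummy data, as above
df1 <- mtcars[1:12, 1:6]

n <- 3                                    # block size
grp <- ceiling(seq_len(nrow(df1)) / n)    # 1 1 1 2 2 2 3 3 3 4 4 4

# split into blocks, take column sums of each, bind the results as rows
res <- t(sapply(split(df1, grp), colSums))
res
```

For the 48-row data in the question, the same lines should yield 16 rows of sums without any further changes.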
zx8754

This will also work:

df$group <- ceiling((1:nrow(df))/3)
aggregate(. ~ group, data = df, sum)[-1]

       X1       X2      X3      X4      X5       X6 X7 X8     X9
1    0.00     0.00 4794.98    0.00   60.52  1875.36  0  0   0.00
2    0.00     0.00 4793.50 2107.55    0.00  5689.40  0  0   0.00
3 4270.87   455.04    0.00    0.00    0.00 14459.56  0  0   0.00
4    0.00 12524.50    0.00    0.00 2169.64     0.00  0  0 699.93
Sandipan Dey