0

I have a table with a list of people, each with a different ID. The IDs are replicated 12*n times and each row contains a value. For instance:

table

I want to calculate the sum of these values every 12 months,i.e the sum of values from row 2:13, 14:25, etc... and that for each person in my data table.

I have done this but it does not seem to work.

for (i in unique(new_table[,"ID"])){

  n<-data[n_pers,9]*12   # n differs for each person and is obtained from another table

  for (t in 0:n) {

  new_table$yearly<-sum(new_table[which(new_table$ID==i),][t*12+1:min(n,(t+1)*12+1) ,"Values"])

}}

No errors are generated. However, I only have NA in my yearly column. Why is that so? Any help would be very much appreciated.

Karolis Koncevičius
  • 9,417
  • 9
  • 56
  • 89
Chu
  • 17
  • 6
  • 1
    Welcome to SO! Please, can you provide a [mcve] as well as the expected result? As it is currently written, it is difficult for me to understand what you are trying ro achieve. Thank you. – Uwe Sep 19 '18 at 16:15
  • 1
    Additional links for making this question reproducible: https://stackoverflow.com/questions/5963269 and https://stackoverflow.com/tags/r/info. – r2evans Sep 19 '18 at 16:29
  • do you want to start from row 2 or from row 1?? – Onyambu Sep 19 '18 at 16:48
  • 1
    create a grouping variable ie `rep(1:(nrow(df)%/%12),each=12,length = nrow(df))` then aggregate on it – Onyambu Sep 19 '18 at 16:51
  • you can also use rollapply(new_table[which(new_table$ID==i),], 12, sum, by = 12) – Shirin Yavari Sep 19 '18 at 19:50

2 Answers2

1

Here is what you need:

ID<-c(rep("a",36),rep("b",60));Val<-round(rnorm(96,15,4),digits=0)
df<-as.data.frame(cbind(ID,Val))
df$Val<-as.numeric(df$Val)
Yearly<-NULL
df$Yearly<-0
library("zoo")
for(i in unique(df$ID)){
  Yearly<-rep(rollapply(df[which(df$ID==i),]$Val, 12, sum, by = 12), 
        rep(12,nrow(df[which(df$ID==i),])%/%12))
  df[which(df$ID==i),]$Yearly<-Yearly
}

to test it :

sum(df[12,]$Val)
#[1] 102

head(df,12)
#   ID Val Yearly
#1   a   6    102
#2   a   9    102
#3   a  12    102
#4   a   5    102
#5   a  19    102
#6   a  12    102
#7   a  10    102
#8   a   4    102
#9   a   7    102
#10  a   4    102
#11  a   8    102
#12  a   6    102

Edited to answer the question asked in your comment(I was not quite clear what you need, so left you several options to choose from!):

set.seed(1)
ID<-c(rep("a",36),rep("b",60));Val<-round(rnorm(96,15,4),digits=0)
df<-as.data.frame(cbind(ID,Val))
df$Val<-as.numeric(df$Val)
Yearly<-NULL
df$Yearly<-0
library("zoo")
for(i in unique(df$ID)){
  Yearly<-rep(rollapply(df[which(df$ID==i),]$Val, 12, sum, by = 12), 
          rep(12,nrow(df[which(df$ID==i),])%/%12))
  df[which(df$ID==i),]$Yearly<-Yearly
}
library("dplyr")
df<-df %>%
group_by(ID) %>% 
mutate(TotalSum= sum(Val),cumulativeSum=cumsum(Val),dif=TotalSum-lag   (cumulativeSum))
# View(df)
df$dif_withReplacedNAs<-df$dif
df[which(is.na(df$dif)),]$dif_withReplacedNAs <- df[which(is.na(df$dif)),]$TotalSum
head(df,16)
# ID      Val Yearly TotalSum cumulativeSum   dif dif_withReplacedNAs
# <fct> <dbl>  <dbl>    <dbl>         <dbl> <dbl>               <dbl>
#1 a        3.    85.     300.            3.   NA                 300.
#2 a        7.    85.     300.           10.  297.                297.
#3 a        3.    85.     300.           13.  290.                290.
#4 a       12.    85.     300.           25.  287.                287.
#5 a        7.    85.     300.           32.  275.                275.
#6 a        3.    85.     300.           35.  268.                268.
#7 a        8.    85.     300.           43.  265.                265.
#8 a        9.    85.     300.           52.  257.                257.
#9 a        8.    85.     300.           60.  248.                248.
#10 a        5.    85.     300.           65.  240.                240.
#11 a       12.    85.     300.           77.  235.                235.
#12 a        8.    85.     300.           85.  223.                223.
#13 a        4.   111.     300.           89.  215.                215.
#14 a       16.   111.     300.          105.  211.                211.
#15 a       10.   111.     300.          115.  195.                195.
#16 a        6.   111.     300.          121.  185.                185.
Shirin Yavari
  • 626
  • 4
  • 6
  • @Chu great! can you please mark this as the correct answer then please thanks! – Shirin Yavari Sep 20 '18 at 16:50
  • Suppose I want to calculate the sum of values from i=1 to n for each ID, could you kindly tell me how to modify the code? – Chu Sep 24 '18 at 14:40
  • @Chu do you only want the sum of values by id something like a 235 b 402 ? if yes then the following line will give you what you want: aggregate(df$Val, by=list(df$ID), "sum") – Shirin Yavari Sep 24 '18 at 21:04
  • Not really. For instance, if we consider the values of head(df,12), for the first a, I want to calculate the sum of all the remaining values of a (from first value to last value). For the second a, i want to calculate the sum of second value of a to last value of a.. and so on.. – Chu Sep 26 '18 at 09:17
  • @Chu I edited the solution to answer your question so can you please kindly mark it up as the correct solution if it works. – Shirin Yavari Sep 27 '18 at 15:59
0

You can also use dplyr to get what you need. The trick here is to create your new grouping variable, year_index here. Or adjust it to whatever interval you need.

df <- data.frame(ID = c("a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "b", "b"), Val = c(13, 12, 11, 14, 15, 6, 13, 12, 1, 9, 10, 5, 1,2,3,4,5,6,7,8,9,10,11,12,1,2))

library(dplyr)

df <- df %>% 
  group_by(ID) %>%
  mutate(year_index = ((1:n() -1) %/% 12) + 1) %>%
  group_by(year_index) %>% 
  mutate(sum = sum(Val))

# A tibble: 26 x 4
# Groups:   month_index [2]
   ID      Val month_index   sum
   <fct> <dbl>       <dbl> <dbl>
 1 a        13           1   124
 2 a        12           1   124
 3 a        11           1   124
 4 a        14           1   124
 5 a        15           1   124
 6 a         6           1   124
 7 a        13           1   124
 8 a        12           1   124
 9 a         1           1   124
10 a         9           1   124
11 a        10           1   124
12 a         5           1   124
13 a         1           2    78
14 a         2           2    78
15 a         3           2    78
16 a         4           2    78
17 a         5           2    78
18 a         6           2    78
19 a         7           2    78
20 a         8           2    78
21 a         9           2    78
22 a        10           2    78
23 a        11           2    78
24 a        12           2    78
25 b         1           1   124
26 b         2           1   124
Anonymous coward
  • 2,061
  • 1
  • 16
  • 29