
I have a data set named pnc_1.

It has a date column with several repeated dates and a wid column with numeric values. I am trying to add up all the wid values for each distinct date and then put the results into a vector.
The loop below freezes RStudio.

cumsum <- 0
for (i in 1:(nrow(pnc_1)-1)) {
  while (pnc_1$date[i] == pnc_1$date[i+1]) {
    cumsum <- (pnc_1$wid[i] + pnc_1$wid[i+1])
  }
}
Rui Barradas
  • Try `tapply(v1, datevec, FUN = sum)` or if we need a vector with the same length as original vector, then `ave(v1, datevec, FUN = sum)` – akrun Aug 07 '17 at 11:40
  • wait. what is v1 in your code? –  Aug 07 '17 at 11:42
  • it could be initial vector – akrun Aug 07 '17 at 11:42
  • what is datevec? –  Aug 07 '17 at 11:43
  • @JasonJoonWooBaik What is iterated in the while-loop? And if there is something to iterate, why do you overwrite the object `cumsum` each time? – jogo Aug 07 '17 at 11:44
  • @jogo hey, i think you are onto something. how would you fix my while-loop? –  Aug 07 '17 at 11:46
  • `pnc$cumwid <- ave(pnc$wid, pnc$date, FUN=cumsum)` If necessary, sort your data frame `pnc` by `$date` before the calculation. – jogo Aug 07 '17 at 11:49
  • @jogo Is there a function like ave() but gives you the total sum, not the average? –  Aug 07 '17 at 12:05
  • @JasonJoonWooBaik `ave()` is not only for averages. Please read the documentation and do `example(ave)`. Also `ave(pnc$wid, pnc$date, FUN=cumsum)` is not calculating averages. Please edit your question: give [mcve] and show your desired result! BTW: https://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group https://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega – jogo Aug 07 '17 at 12:10
  • @jogo thanks for the links. However, I do believe I am correct when I said `ave(pnc$wid, pnc$date, FUN=cumsum)` does calculate averages. –  Aug 07 '17 at 12:22
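
The points raised in these comments can be checked on a small example. The sketch below uses made-up data (the data frame `pnc` and its values are assumptions, not the asker's data). The original `while` condition never changes inside its body because `i` is not advanced there, so it cannot terminate once two adjacent dates match; the loop here makes a single pass instead. The accumulator is called `loop_sums` rather than `cumsum` so that it does not share a name with base R's `cumsum()`. The last lines compare `tapply()` and `ave()` with different `FUN` arguments.

```r
# Toy data (hypothetical values)
pnc <- data.frame(
  date = as.Date(c("2017-08-01", "2017-08-01", "2017-08-02",
                   "2017-08-03", "2017-08-03", "2017-08-03")),
  wid  = c(1, 2, 3, 4, 5, 6)
)

# A loop that terminates: visit each row once and add its wid
# to the running total stored under that row's date
loop_sums <- numeric(0)
for (i in seq_len(nrow(pnc))) {
  key <- as.character(pnc$date[i])
  if (!(key %in% names(loop_sums))) {
    loop_sums[key] <- 0            # first time this date is seen
  }
  loop_sums[key] <- loop_sums[key] + pnc$wid[i]
}

# One total per distinct date
group_sums <- tapply(pnc$wid, pnc$date, FUN = sum)

# ave() returns a vector as long as the input; FUN decides what is computed
ave_sum    <- ave(pnc$wid, pnc$date, FUN = sum)     # group total repeated per row
ave_cumsum <- ave(pnc$wid, pnc$date, FUN = cumsum)  # running total within each date
ave_mean   <- ave(pnc$wid, pnc$date)                # default FUN is mean

print(loop_sums)
print(group_sums)
print(ave_sum)
print(ave_cumsum)
print(ave_mean)
```

With these values, the loop and `tapply()` both give one total per date, while `ave()` only produces averages when `FUN` is left at its default.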

2 Answers

library(dplyr)

# Creating a dataset
dates <- c(Sys.time() - (3600*24), Sys.time() - 2*(3600*24), Sys.time())
dates <- as.Date(dates)

dates <- rep(dates, each = 3)
wid <- c(1:9)

pnc_1 <- data.frame(date = dates, wid = wid)

# Using pipe operator and summarise to sum data by the grouping variable 'date'

sums <- pnc_1 %>%
  group_by(date) %>%
  summarise(wid.sum = sum(wid))

# Extract the summed column as a plain vector
vector.of.sums <- sums$wid.sum
Jason

You can use aggregate to perform your task efficiently. You don't need to go through the hassle of a for loop.

sum_wid = aggregate(wid ~ date, pnc_1, sum)$wid
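
As a rough illustration of what that one-liner returns, here is a minimal sketch on made-up data (the values below are assumptions for the example only). `aggregate()` with a formula gives a data frame with one row per distinct date, and `$wid` pulls out the sums; the name `named_sums` is introduced here just to show how the dates can be kept alongside the totals.

```r
# Hypothetical sample data with repeated dates
pnc_1 <- data.frame(
  date = as.Date(c("2017-08-05", "2017-08-05", "2017-08-06",
                   "2017-08-07", "2017-08-07")),
  wid  = c(10, 20, 5, 1, 2)
)

# One row per distinct date, with wid summed within each date
by_date <- aggregate(wid ~ date, data = pnc_1, FUN = sum)
print(by_date)

# Plain vector of sums, as in the answer
sum_wid <- by_date$wid

# Optional: label each sum with its date so the pairing is not lost
named_sums <- setNames(by_date$wid, format(by_date$date))
print(named_sums)
```

Note that the formula interface of `aggregate()` omits rows with missing values by default, so rows where `wid` or `date` is `NA` would not contribute to the sums.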

Alternatively you can also use data.table package to get it done.

library("data.table")
pnc_1_dt = data.table(pnc_1)
sum_wid = pnc_1_dt[, sum(wid), by = date]$V1  # group by the date column; the unnamed result column is called V1
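
A variant that names the summed column instead of relying on the default `V1` might look like the sketch below. It assumes the data.table package is installed (the guard skips the example otherwise), and the sample values and the column name `wid_sum` are made up for illustration.

```r
# Runs only if data.table is available; otherwise the block is skipped
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)

  # Hypothetical sample data with repeated dates
  pnc_1 <- data.frame(
    date = as.Date(c("2017-08-05", "2017-08-05", "2017-08-06",
                     "2017-08-07", "2017-08-07")),
    wid  = c(10, 20, 5, 1, 2)
  )

  pnc_1_dt <- as.data.table(pnc_1)

  # .() builds a named result column; by = date groups on the date column
  sums_dt <- pnc_1_dt[, .(wid_sum = sum(wid)), by = date]
  print(sums_dt)

  # Plain vector of sums, in order of first appearance of each date
  sum_wid_dt <- sums_dt$wid_sum
}
```

Naming the column makes the later extraction (`$wid_sum`) easier to read than `$V1`, especially when several summaries are computed at once.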
Srijan Sharma