My data looks something like this:

01/04/2006,900,11756

01/04/2006,901,7492

01/04/2006,902,4012

01/04/2006,903,3190 ....

The first column is the date and the second column is the time (900 means 9:00). I want to get the daily sum of the third column.

Note that neither the dates nor the times are necessarily continuous.

To find all the dates in the data, I can use the `unique()` function.
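A minimal sketch of that `unique()`-based idea, using only the four sample rows shown above. The column names `date`, `time`, and `value` are assumptions made for illustration; the original data has no header.

```r
# Sample rows from the question; column names are assumed, not from the data.
df <- read.csv(text = "01/04/2006,900,11756
01/04/2006,901,7492
01/04/2006,902,4012
01/04/2006,903,3190",
  header = FALSE, col.names = c("date", "time", "value"),
  colClasses = c("character", "integer", "numeric"))

# One entry per distinct date, in order of first appearance.
dates <- unique(df$date)

# For each date, sum the values of the rows carrying that date.
# The result is a numeric vector named by date.
daily <- sapply(dates, function(d) sum(df$value[df$date == d]))
daily
```

This works regardless of gaps in the dates or times, because each date is matched by value rather than by position.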

GeekCat
  • Try `aggregate(value~Date, df, sum)` where `value` is the 3rd column, Date is the first column in your dataset "df" – akrun Feb 04 '15 at 09:14
  • @akrun, your answer seems to work, but the results come back in a strange order. What order does `aggregate()` return its rows in? – GeekCat Feb 04 '15 at 09:45
  • You want the daily sum so this implies that we only need to take care of the date column. I wondered if something should be done with the time column or if it can be ignored? – Ruthger Righart Feb 04 '15 at 09:48
  • @akrun, thanks for what you pointed out. Would you mind posting your comment as an answer? I will tick it. ^_^ – GeekCat Feb 04 '15 at 09:53
  • @GeekCat It is already a duplicate :-) – akrun Feb 04 '15 at 11:23

1 Answer

The following uses the plyr package and gives the daily sum of the third column:

library(plyr)

dd <- as.Date(c("01/04/2006", "01/04/2006", "01/04/2006", "01/04/2006"), format="%d/%m/%Y")
time <- as.character(c("9:00","9:01","9:02","9:03"))
val <- as.numeric(c(11756,7492,4012,3190))

dat <- data.frame(dd,time,val)

ddply(dat, .(dd), summarize, val = sum(val)) 
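
For comparison, here is a hedged base-R sketch of the `aggregate()` call suggested in the comments. It may also explain the "strange order": `aggregate()` returns its rows sorted by the grouping column, so dates kept as text sort alphabetically, while `Date` values sort chronologically. The column names and the last row (12/03/2006) are made up for illustration, and the `%d/%m/%Y` format is the same assumption used above.

```r
# Four rows from the question plus one hypothetical row (12/03/2006)
# added only to make the ordering visible. Column names are assumed.
raw <- read.csv(text = "01/04/2006,900,11756
01/04/2006,901,7492
01/04/2006,902,4012
01/04/2006,903,3190
12/03/2006,900,100",
  header = FALSE, col.names = c("Date", "time", "value"),
  colClasses = c("character", "integer", "numeric"))

# Grouping on text: rows come back in alphabetical order of the date string.
by_text <- aggregate(value ~ Date, raw, sum)

# Grouping on Date objects: rows come back in chronological order.
raw$Date <- as.Date(raw$Date, format = "%d/%m/%Y")
by_date <- aggregate(value ~ Date, raw, sum)

by_text
by_date
```

If the dates are really month/day/year, change the format string to `"%m/%d/%Y"`.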
Ruthger Righart