1

This is kind of a follow-up to my question at Can I subset specific years and months directly from POSIXct datetimes?

I have a dataframe

test <- data.frame(seq(from = as.POSIXct("1983-03-09 01:00"), to = as.POSIXct("1985-01-08 00:00"), by = "hour"))
colnames(test) <- "DateTime"
test$Value<-sample(0:100,16104,rep=TRUE)

and I am subsetting particular years and months using

# Add year column
test$Year <- as.numeric(format(test$DateTime, "%Y"))

# Add month column
test$Month <- as.numeric(format(test$DateTime, "%m"))

# Subset specific year (1984 in this case)
sub1 = subset(test, Year!="1983" & Year!="1985")

# Subset specific months (April and May in this case)
sub2 = subset(test, Month=="4" | Month=="5")

From these subsets sub1 and sub2, I want to use the hourly data to calculate daily minimum, mean, and maximum from column Value.

I found a solution at Aggregating hourly data into daily aggregates

stat <- function(x) c(min = min(x), max = max(x), mean = mean(x))
sub1$Date <- as.Date(sub1$DateTime)
sub2$Date <- as.Date(sub2$DateTime)
aggregate(Value ~ Date, sub1, stat)
aggregate(Value ~ Date, sub2, stat)

This appears to give the minimum, mean, and maximum in columns (although I can't verify as I can't read the top in the R output window). I need to make these aggregate results into a dataframe containing Date, min, mean, and max. Does anyone know how I might do that? I've tried

sub1.sum <- aggregate(Value ~ Date, sub1, stat)

and

sub1.sum <- as.data.frame(aggregate(Value ~ Date, sub1, stat))

but that appears to just return a single value (i'm not sure if this is the min, mean, or max).

Community
  • 1
  • 1
Thomas
  • 2,484
  • 8
  • 30
  • 49
  • Ok when I run your code `aggregate(Value ~ Date, sub1, stat)` returns `Date Value.min Value.max Value.mean 1 1984-01-01 1.00000 76.00000 41.93750 . .......... ....... ........ ........` Is that not what you wanted. – CCurtis Apr 11 '14 at 05:12
  • Hello CCurtis, yes that is what I wanted, but I would like to store those results in a dataframe with columns for date, min, mean, and max. Do you know how I might do that? My attempts thus far (above) have failed. – Thomas Apr 11 '14 at 05:36

1 Answers1

1

cbind(sub1.sum[,-2, drop=FALSE], as.data.frame(sub1.sum$Value)) gives you what you want, I guess. The problem before was the the value column consisting of 3 dimensions (min, mean, max)...

Christian Borck
  • 1,812
  • 1
  • 13
  • 19
  • Nice. Ok trying to understand how you answer works. What does indexing with -2 do? I've never had a problem like this where R will show you data but not allowing you to put it in a table. – CCurtis Apr 11 '14 at 06:50
  • -2 means all columns but the second. You could also say `sub1.sum[,1, drop=FALSE]` to only keep the first column (Dates). `drop=FALSE` prevents from converting a single column dataframe into a vector. So what the code does in total is: pasting together (cbind is for column bind) the Dates column with the three columns in Value. – Christian Borck Apr 11 '14 at 07:04
  • Ok and calling `sub1.sum$Value` is just returning min max mean I'm assuming. Mainly hadn't seen negative values used to index before. Is a good trick to know. Also interesting to know that `aggregate` was adding three values into a single column but only returning 1 and not reporting any error. – CCurtis Apr 11 '14 at 16:20