
I have a data frame containing a date column and the temperatures of 34 different systems, each system in its own column. I need to calculate the average hourly temperature for every system. The code below calculates the average for one system, but to get the averages for the other 33 systems I would have to repeat it again and again. Is there a better way to compute the hourly average for all columns at once?

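# ut_ms is presumably milliseconds since the epoch; convert to seconds
dat$ut_ms <- dat$ut_ms / 1000
dat[, 1] <- as.POSIXct(dat[, 1], origin = "1970-01-01")
# Truncate the timestamps to whole minutes
dat$ut_ms <- strptime(dat$ut_ms, "%Y-%m-%d %H:%M")
# Bin the timestamps into hours
dat$ut_ms <- cut(dat$ut_ms, breaks = 'hour')
# Hourly mean for a single system
meanNPWD2401 <- aggregate(NPWD2401 ~ ut_ms, dat, mean)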

I added a picture of the data to make clearer what I want.

Extria
  • Welcome to SO. Can you edit your question and include the result of `dput(head(dat))` so we know what your data currently looks like? See this post on creating a reproducible example in R: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Phil May 10 '16 at 09:37
  • `split(dat, cut(strptime(dat$ut_ms, format = '%F %R'), 'hour'))` will split your data into a list. You can then use `lapply` to iterate over the list. – Sotos May 10 '16 at 09:40
  • `by` would also work. – Roman Luštrik May 10 '16 at 10:08
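
For what it's worth, the `aggregate` call in the question can probably be extended to every column by putting `.` on the left-hand side of the formula, which stands for "all columns not named on the right". Here is a minimal sketch on made-up data; `dat_demo` and the system names `S1` to `S3` are hypothetical stand-ins for the real data frame.

```r
# Hypothetical stand-in for the real data: a time column plus three systems,
# one reading every 10 minutes for 3 hours
set.seed(1)
dat_demo <- data.frame(
  ut_ms = as.POSIXct("2016-05-10 00:00:00", tz = "UTC") +
    seq(0, by = 600, length.out = 18),
  S1 = rnorm(18),
  S2 = rnorm(18),
  S3 = rnorm(18)
)

# Keep the raw readings of S1 from the first hour for comparison
first_hour_S1 <- dat_demo$S1[1:6]

# Bin the timestamps into hours, as in the question
dat_demo$ut_ms <- cut(dat_demo$ut_ms, breaks = "hour")

# `.` on the left-hand side means "every column other than ut_ms"
hourly_means <- aggregate(. ~ ut_ms, data = dat_demo, FUN = mean)
hourly_means
```

One caveat: the formula method of `aggregate` uses `na.action = na.omit` by default, so a row with a missing value in any system is dropped for all systems. If the real data has gaps, the `split`/`lapply` route or a long-format approach may be the safer choice.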

2 Answers


You can split your data by hour and iterate over the pieces:

list1 <- split(dat, cut(strptime(dat$ut_ms, format = '%Y-%m-%d %H:%M'), 'hour'))
lapply(list1, colMeans)
Sotos
  • When I use `lapply` I get the error "x must be numeric". I noticed that my time column is in factor format. I tried to change the time from factor to numeric, but then I can't split the data. Is there a way to change the format within `lapply`? – Extria May 10 '16 at 14:01
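
The error in the comment above most likely comes from `colMeans` receiving the time column along with the temperatures. One possible way around it, sketched here on made-up data, is to keep only the numeric columns inside the function passed to `lapply`. The names `dat_demo`, `S1` and `S2` are hypothetical; this is not the answerer's own fix.

```r
# Hypothetical stand-in for the real data: one reading every 10 minutes
set.seed(1)
dat_demo <- data.frame(
  ut_ms = as.POSIXct("2016-05-10 00:00:00", tz = "UTC") +
    seq(0, by = 600, length.out = 18),
  S1 = rnorm(18),
  S2 = rnorm(18)
)

# Split the rows into one data frame per hour
pieces <- split(dat_demo, cut(dat_demo$ut_ms, breaks = "hour"))

# Average only the numeric columns; the time column (POSIXct or factor)
# is not numeric according to is.numeric() and is therefore skipped
hourly <- lapply(pieces, function(d) {
  colMeans(d[vapply(d, is.numeric, logical(1))])
})

# Stack the per-hour results: one row per hour, one column per system
hourly_matrix <- do.call(rbind, hourly)
hourly_matrix
```

Selecting by `is.numeric` rather than by position means the time column does not have to be the first one, and it works whether the time is stored as a date-time or as a factor.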

When you rearrange the data into a long format, things get much easier:

n.system <- 34
n.time <- 100
temp <- rnorm(n.time * n.system)
temp <- matrix(temp, ncol = n.system)
seconds <- runif(n.time, max = 3 * 3600)
time <- as.POSIXct(seconds, origin = "1970-01-01")
dataset <- data.frame(time, temp)

library(dplyr)
library(tidyr)
dataset %>%
  gather(key = "system", value = "temperature", -time) %>%
  mutate(hour = cut(time, "hour")) %>%
  group_by(system, hour) %>%
  summarise(average = mean(temperature))
Thierry
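
In current versions of tidyr, `gather()` is superseded by `pivot_longer()`. The same long-format idea might be written as follows; this is a sketch assuming dplyr 1.0 or later and tidyr 1.0 or later. The smaller simulated data set and the names `hourly_long` and `hourly_wide` are illustrative choices, not part of the answer above.

```r
library(dplyr)
library(tidyr)

# Simulated data in the same shape as above, with fewer systems
set.seed(42)
n_system <- 4
n_time <- 50
temps <- matrix(rnorm(n_time * n_system), ncol = n_system)
time <- as.POSIXct(runif(n_time, max = 3 * 3600),
                   origin = "1970-01-01", tz = "UTC")
dataset <- data.frame(time, temps)  # columns: time, X1, ..., X4

# Long format: one row per (time, system) pair, then average per system and hour
hourly_long <- dataset %>%
  pivot_longer(-time, names_to = "system", values_to = "temperature") %>%
  mutate(hour = cut(time, "hour")) %>%
  group_by(system, hour) %>%
  summarise(average = mean(temperature), .groups = "drop")

# Optional: back to one column per system, one row per hour
hourly_wide <- hourly_long %>%
  pivot_wider(names_from = system, values_from = average)
hourly_wide
```

The wide result is closer to the layout of the original data frame, which may be convenient if the hourly averages need to be joined back to other per-hour data.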