How to take data from Average of Hour ( from a 30 seconds data sheet) in R

Question

I'm working on a huge data sheet in excel which has data in every 30 seconds of a day. . Two columns I need to perform the task are time and power . I'm asked to find the daily power in hour basis . So I need to take the average of each hour from 30 seconds data in R studio . Please help me with it . I'm new to this software .

also its a normal excel sheet with time and power values . but the time is in the formate of dd:mm:yy hh:mm:ss. And I need to plot the graph of power and time (hourly basis) . for a particular day , giving times eg , 8 o clock , 9o clock etc

Can you post sample data? Please edit the question with the output of `dout(head(your_data, 20))`. — Rui Barradas, Aug 09 '18 at 06:33
Please provide sample data in a copy&paste-able format (e.g. using `dput`) and expected output. It might be useful to take a look at how to provide a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). — A. Suliman, Aug 09 '18 at 06:33
its a normal excel sheet with time and power values . format of time is date:month:year hour:minute:second . and time is for every 30 seconds . example 6:21:00 , 6:21:30 , 6:22:00 etc — Fatma Zahra, Aug 09 '18 at 06:43
You shouldn't post data (or code) as images. Please use `dput` as explained above. Also, you want the averages of what column? The last one, `m_Pm`? — Rui Barradas, Aug 09 '18 at 07:14
That is just a matter of changing the variable name in my answer, from `x` to `m_Pm`. Done. — Rui Barradas, Aug 09 '18 at 07:19
@RuiBarradas it is showing current date, i want the output of the date provided :( — Fatma Zahra, Aug 09 '18 at 07:32
I don't understand, the `format` instruction will keep ymd and hour. Can you post the code you are using? — Rui Barradas, Aug 09 '18 at 07:44
@RuiBarradas the answer you provided is perfect . but , seq(as.POSIXct(Sys.Date()) - 1, Sys.time(), by = "30 secs") this command makes the data use today's date. and not the date from my data — Fatma Zahra, Aug 09 '18 at 07:45
@RuiBarradas I need to get daily average of m_pm on hour basis of the date given , for example i want from 2017-09-26 to 2017-09-27 (that's 1day) — Fatma Zahra, Aug 09 '18 at 07:47
If you want **daily** averages, try `tapply(dat$m_Pm, as.Date(dat$time), mean)` and `aggregate(m_Pm ~ as.Date(time), dat, mean)`. — Rui Barradas, Aug 09 '18 at 08:02
You are not understanding, `as.POSIXct` is just to make an example dataset, you do **not** need to use it in your code, what you must use is `format` and `tapply` or `aggregate`. — Rui Barradas, Aug 09 '18 at 08:09
@RuiBarradas okay sorry , thanks again . I'm new to this ... — Fatma Zahra, Aug 09 '18 at 08:20
@RuiBarradas wont disturb again . just my last doubt ... How to get one day's energy on hourly basis . for example , i want energy of 2017-09-26 in hourly basis , as in 8.o clock , 9 o clock , 10 o clock etc (by taking hourly average ). PLEASE HELP — Fatma Zahra, Aug 09 '18 at 08:50
You can aggregate by several variables, `aggregate(m_Pm ~ as.Date(time) + h, dat, mean)`. — Rui Barradas, Aug 09 '18 at 08:56
@RuiBarradas Sir , I still am not able to plot graph for a single day with time. :( — Fatma Zahra, Aug 12 '18 at 06:53

Rui Barradas · Answer 1 · 2018-08-12T08:03:33.780

In base R, you can use tapply or aggregate to compute group means.

First, some fake data, since you have posted none.
The time variable is created with seq.POSIXt, starts today minus 1 day (Sys.Date - 1) and end today (Sys.time) by increments of 30 seconds.
The numeric variable is just a gaussian random variable.

set.seed(3224)

time <- seq(as.POSIXct(Sys.Date()) - 3, Sys.time(), by = "30 secs")
dat <- data.frame(time, m_Pm = cumsum(rnorm(length(time))))

Now, start by getting just the hours, then find group means.

dat$datehour <- format(dat$time, "%Y-%m-%d %H")

tapply(dat$m_Pm, dat$datehour, mean)
#2018-08-12 00 2018-08-12 01 2018-08-12 02 2018-08-12 03 2018-08-12 04 
#    0.5159274     8.2868756    24.8056718    26.4107549    27.6064418 
#2018-08-12 05 2018-08-12 06 2018-08-12 07 2018-08-12 08 
#   33.0096640    42.7479312    40.2468852    41.5313523


agg <- aggregate(m_Pm ~ datehour, dat, mean)

head(agg)
#       datehour       m_Pm
#1 2018-08-12 00  0.5159274
#2 2018-08-12 01  8.2868756
#3 2018-08-12 02 24.8056718
#4 2018-08-12 03 26.4107549
#5 2018-08-12 04 27.6064418
#6 2018-08-12 05 33.0096640

As for the graph, I will draw a line graph using package `ggplot2`. The x axis is formated with package `scales` function `scale_x_datetime`.

library(ggplot2)
library(scales)


ggplot(agg, aes(x = as.POSIXct(datehour, format = "%Y-%m-%d %H"), y = m_Pm)) + 
  geom_point() +
  geom_line() +
  labs(x = "Hour", y = "Mean value") +
  scale_x_datetime(labels = date_format("%Y-%m-%d %H"))

time <- seq(as.POSIXct(Sys.Date()) - 1, Sys.time(), by = "30 secs") dat <- data.frame(time, x = rnorm(length(time))) can you explain me this , please ? — Fatma Zahra, Aug 09 '18 at 06:49
@FatmaZahra Done, I hope this is it. I have changed everything. — Rui Barradas, Aug 12 '18 at 08:03
i'm getting a plot of whole data set. so i cannot get a single day's data ?? — Fatma Zahra, Aug 13 '18 at 05:27
@FatmaZahra You can subset by date if you want. `agg[as.Date(agg$datehour) == as.Date("2018-08-11"), ]`. — Rui Barradas, Aug 13 '18 at 06:08

score 0 · Answer 2 · answered Aug 09 '18 at 14:44

It would be much more helpful if you posted a reproducible dataframe. Anyway, I'll just give you a generic solution...

df <- read.table(header=TRUE, stringsAsFactors=FALSE, text="
date hour temperature
28/12/2013 13:03:01  41.572
28/12/2013 13:08:01  46.059
28/12/2013 13:13:01  48.55
28/12/2013 13:18:01  49.546
28/12/2013 13:23:01  49.546
28/12/2013 13:28:01  49.546
28/12/2013 13:33:01  50.044
28/12/2013 13:38:01  50.542
28/12/2013 13:43:01  50.542
28/12/2013 13:48:01  51.04
28/12/2013 13:53:01  51.538
28/12/2013 13:58:01  51.538
28/12/2013 14:03:01  50.542
28/12/2013 14:08:01  51.04
28/12/2013 14:13:01  51.04
28/12/2013 14:18:01  52.534
28/12/2013 14:23:01  53.031
28/12/2013 14:28:01  53.031
28/12/2013 14:33:01  53.031
28/12/2013 14:38:01  51.538
28/12/2013 14:43:01  53.031
28/12/2013 14:48:01  53.529
28/12/2013 15:01:01  50.77")

means <- aggregate(temperature ~ datehour, df, mean)

# Result:
means
             datehour temperature
1 2013-12-28 13:00:00    49.17192
2 2013-12-28 14:00:00    52.23470
3 2013-12-28 15:00:00    50.77000

thank you . but how do I select just a day ?? for example I want daily average plot . so I need for example average power of 2017-09-26 8 o clock , 9 o clock data — Fatma Zahra, Aug 11 '18 at 06:32

How to take data from Average of Hour ( from a 30 seconds data sheet) in R

2 Answers2