I have a data frame that contains 14 parameters (such as speed and direction) of birds at different dates, heights, and times. The Time column is in 10-minute steps, coded as: 0, 10, 20, 30, 40, 50, 100 (i.e. 01:00), 110, 120, 130, 140 (i.e. 01:40), 150, 200, ...

I need to calculate the mean of every parameter for each hour at a specific date and height, where each hour runs from half an hour before to half an hour after the full hour, e.g. 02:30-03:30 (230-330 in my data). I can do this by stepping through the time in jumps of 100 (except for the first and last half hours): 30-130, 130-230, 230-330, etc.
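The binning described above can be sketched in base R. This is only a minimal illustration: the helper names `hhmm_to_minutes` and `hour_bin` are hypothetical, and it assumes that a time exactly on the half hour (e.g. 30 or 130) opens the next bin, which the question does not specify.

```r
# Convert HHMM-coded times (0, 10, ..., 2350) to minutes since midnight.
hhmm_to_minutes <- function(time) (time %/% 100) * 60 + time %% 100

# Hour bins centred on the full hour: shifting by 30 minutes before the
# integer division puts 00:30-01:29 in bin 1, 01:30-02:29 in bin 2, and so on.
# Bin 0 is the first half hour, bin 24 the last half hour.
# Assumption: a time exactly on the half hour belongs to the later bin.
hour_bin <- function(time) (hhmm_to_minutes(time) + 30) %/% 60

times <- c(0, 20, 30, 120, 130, 2320, 2330, 2350)
hour_bin(times)
# 0 0 1 1 2 23 24 24
```

Working in minutes rather than in the raw HHMM codes avoids the gaps between 50 and 100, 150 and 200, and so on, so the first and last half hours need no special handling.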

The data contain 133,280 rows and 14 parameters. This is what part of my data looks like (image: datamean).

Here is a sample:

df <- structure(list(Date = c(20160401, 20160401, 20160401, 20160401, 20160401, 20160401), Height = c(1200, 1200, 1200, 1400, 1400, 1400), Time = c(2330, 2340, 2350, 0, 10, 20), U = c(-9.55828285217285, -9.64695262908935, -9.67818069458007, -4.78218698501586, -4.87779474258422, -5.00569248199462), V = c(1.84902167320251, 2.02197194099426, 1.70393645763397, 3.40449619293212, 3.01245355606079, 2.91069912910461 )), class = "data.frame", .Names = c("Date", "Height", "Time", "U", "V"), row.names = c(NA, -6L))

For each Date and Height (0-3800 in steps of 200) I need 25 rows, one per hour bin: 0-30, 30-130, 130-230, 230-330, 330-430, ..., 2230-2330, 2330-2350, each holding the mean of all the parameters.

The data frame I want to get will look like:

Date   Height   Time         U        V         W    Speed Direction 
20160401  0  0-30     -5.53     1.8     -25.13    8.5    265.35
20160401  0  30-130    -4.7      2.1     -35.19    5.3    270.23
.
.
.
.
20160401 200 0-30     -5.53     1.8     -25.13    8.5    265.35
.
.
20160402  0  0-30    -4.7      2.1     -35.19    5.3    270.23

Can anyone help me? Thanks

  • Please don't post screenshots of your data. Provide a reproducible example of your data frame; you can use `dput` to do that. – www Jun 25 '17 at 14:17
  • You can find how to make your example reproducible [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Thomas K Jun 25 '17 at 19:46
  • Sorry, this is the first time I have asked a question here. I first tried to copy straight from R, but for some reason it didn't look good. Anyway, here is part of my data with dput. I used this part because I looked for rows with no NA (I will put it in the next comment because there is not enough space in this one). – Inbal Goldstein Jun 26 '17 at 06:15
  • > dput(as.data.frame(datmean[1006:1011,1:5])) structure(list(Date = c(20160401, 20160401, 20160401, 20160401, 20160401, 20160401), Height = c(1200, 1200, 1200, 1400, 1400, 1400), Time = c(2330, 2340, 2350, 0, 10, 20), U = c(-9.55828285217285, -9.64695262908935, -9.67818069458007, -4.78218698501586, -4.87779474258422, -5.00569248199462), V = c(1.84902167320251, 2.02197194099426, 1.70393645763397, 3.40449619293212, 3.01245355606079, 2.91069912910461 )), class = "data.frame", .Names = c("Date", "Height", "Time", "U", "V"), row.names = c(NA, -6L)) – Inbal Goldstein Jun 26 '17 at 06:21
  • You can do it with dplyr in this way: `library(dplyr); df %>% mutate(new_height = as.integer(Height/200) * 200, new_time = as.integer(Time/100) * 100 + as.integer(Time %% 100 / 30) * 30) %>% group_by(Date, new_height, new_time) %>% summarise_all(mean, na.rm = TRUE)`. What is still missing: a) the Time is not given as the value 0-30 but as 0 or 30; b) it is unclear whether 30 should be in the 0-30 group or in the other group (currently it is in the other group); c) the time group is not named in the format "from-to" but only "from". – Jan Jun 26 '17 at 07:31
  • Thank you! It helped me a lot. I made some small changes and it solved my problem. – Inbal Goldstein Jun 29 '17 at 12:47
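
The asker's final changes are not shown in the thread. One possible adaptation of the dplyr suggestion, run on the sample data from the question, is sketched below. It is not necessarily what was actually used. It assumes half-hour boundaries belong to the later bin (so 30 goes into 30-130), that Height is already a multiple of 200, and the names `bin_labels` and `hourly` are made up for illustration.

```r
library(dplyr)

# Sample rows from the question.
df <- data.frame(
  Date   = c(20160401, 20160401, 20160401, 20160401, 20160401, 20160401),
  Height = c(1200, 1200, 1200, 1400, 1400, 1400),
  Time   = c(2330, 2340, 2350, 0, 10, 20),
  U = c(-9.55828285217285, -9.64695262908935, -9.67818069458007,
        -4.78218698501586, -4.87779474258422, -5.00569248199462),
  V = c(1.84902167320251, 2.02197194099426, 1.70393645763397,
        3.40449619293212, 3.01245355606079, 2.91069912910461)
)

# The 25 "from-to" labels: 0-30, 30-130, ..., 2230-2330, 2330-2350.
bin_labels <- c(
  "0-30",
  paste0(seq(30, 2230, by = 100), "-", seq(130, 2330, by = 100)),
  "2330-2350"
)

hourly <- df %>%
  mutate(
    minutes = (Time %/% 100) * 60 + Time %% 100,  # HHMM code -> minutes since midnight
    bin     = (minutes + 30) %/% 60,              # 0..24, centred on the full hour
    Time    = factor(bin_labels[bin + 1], levels = bin_labels)
  ) %>%
  select(-minutes, -bin) %>%
  group_by(Date, Height, Time) %>%
  summarise_all(mean, na.rm = TRUE) %>%           # mean of every remaining parameter
  ungroup()

hourly
```

On the six sample rows this yields two rows, one for Height 1200 in the 2330-2350 bin and one for Height 1400 in the 0-30 bin. Storing the label as a factor with the levels in chronological order keeps the bins sorted by time rather than alphabetically.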

0 Answers