I need to aggregate my "Day" dataset by Date:

> head(Day)
        Date Day Month Year  TimeDay Room Temperature Light     RH
1 02/09/2013   2     9 2013 08:00:00    2        21.7 71.76 100.00
2 02/09/2013   2     9 2013 08:15:00    2        21.7 61.27 100.00
3 02/09/2013   2     9 2013 08:30:00    2        21.7 58.96 100.00
4 02/09/2013   2     9 2013 08:45:00    2        21.8 52.96 100.00
5 02/09/2013   2     9 2013 09:00:00    2        22.0 59.92  86.26
6 02/09/2013   2     9 2013 09:15:00    2        22.2 65.12  84.01

but including column 6, which corresponds to the Room number:

newDay <- aggregate(Day[, 6:9], list(Day$Date), mean,na.rm=TRUE)

I got the following warning:

There were 50 or more warnings (use warnings() to see the first 50)

and the "Room" column in the new dataset "newDay" results in NAs.

Is it because the "Room" column is a factor? How should I deal with this issue?

Luisa

1 Answer

Since you don't need TimeDay, I'll just remove it, because mean() can't be applied to it. I'll use dplyr's summarise_each and group_by instead of aggregate. Your example used mean, so I used it too.

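    # Drop TimeDay: mean() can't be applied to it
    Day$TimeDay <- NULL
    library(dplyr)
    # Average every remaining column within each Date,
    # then drop the columns that aren't needed in the result
    newDay <- summarise_each(group_by(Day, Date), funs(mean)) %>%
              select(-Day, -Month, -Year, -Room)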

Edit: Added pipe, thanks @r2evans. Removed Room, as it's not necessary.
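On the factor question itself: calling mean() on a factor returns NA with a warning ("argument is not numeric or logical"), which would explain both the warnings and the NA "Room" column. Below is a minimal base R sketch of that, assuming Room really is stored as a factor. The data frame is rebuilt by hand from the six rows of head(Day) shown in the question, and `room_mean` and `RoomNum` are names made up for illustration.

```r
# Toy data rebuilt from the head(Day) output in the question (six rows only).
# Storing Room as a factor is an assumption about the real dataset.
Day <- data.frame(
  Date        = rep("02/09/2013", 6),
  Room        = factor(rep(2, 6)),
  Temperature = c(21.7, 21.7, 21.7, 21.8, 22.0, 22.2),
  Light       = c(71.76, 61.27, 58.96, 52.96, 59.92, 65.12),
  RH          = c(100, 100, 100, 100, 86.26, 84.01)
)

# mean() on a factor gives NA plus a warning; this is what fills "Room" with NAs
room_mean <- suppressWarnings(mean(Day$Room))

# Aggregate only the numeric measurement columns by Date
newDay <- aggregate(Day[, c("Temperature", "Light", "RH")],
                    by = list(Date = Day$Date),
                    FUN = mean, na.rm = TRUE)

# If the room numbers are needed as numbers, convert via character first
# (as.numeric() directly on a factor would return the internal level codes)
Day$RoomNum <- as.numeric(as.character(Day$Room))
```

With this toy data `newDay` has a single row, because all six observations share one date.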

mmstan
  • Since you introduced dplyr, would the column removal be even easier with `... %>% select(-Day, -Month, -Year)`? – r2evans Nov 06 '15 at 15:56
  • Thank you @mmstan! Maybe I didn't explain clearly what I want to do. I would like to get a data frame aggregated by Date. As you can see from the dataset, there are many observations for the date 02/09/2013, and I want to take the mean of the temperature for that date, in order to get only one observation/row per day. – Luisa Nov 06 '15 at 15:57
  • I do get your desired output with this when I recreate your dataset: one date per row, with the mean temperature. Is using the aggregate function mandatory? – mmstan Nov 06 '15 at 17:19
  • @mmstan Have you tried this? `newDay <- summarise_each(group_by(Day, Date), funs(mean)) %>% select(-Day, -Month, -Year)` Why is the result for the "Room" column NA? – Luisa Nov 09 '15 at 11:17
  • @Luisa The mean() function is applied to the "Room" column; if it produces NA, there might be NAs in your "Room" column. But even if there were no NAs, the command you've written gives you the mean of the Room numbers, which is not very useful information. You'll probably want either to include Room in the grouping, `group_by(Day, Date, Room)`, to have rows aggregated by Date and Room, or to remove it from the resulting data frame with `select(-Day, -Month, -Year, -Room)` if the Room is irrelevant. – mmstan Nov 09 '15 at 11:48
  • Thank you! I got it now! – Luisa Nov 09 '15 at 11:54
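
For later readers: the grouping by Date and Room suggested in the comments can be sketched as below. Since `summarise_each()` and `funs()` have been deprecated in newer dplyr releases, this version uses `across()` instead, which assumes dplyr 1.0.0 or later. The data frame is again rebuilt by hand from the head(Day) rows in the question, with Room assumed to be a factor.

```r
library(dplyr)

# Toy data rebuilt from the head(Day) output in the question (six rows only).
# Storing Room as a factor is an assumption about the real dataset.
Day <- data.frame(
  Date        = rep("02/09/2013", 6),
  Room        = factor(rep(2, 6)),
  Temperature = c(21.7, 21.7, 21.7, 21.8, 22.0, 22.2),
  Light       = c(71.76, 61.27, 58.96, 52.96, 59.92, 65.12),
  RH          = c(100, 100, 100, 100, 86.26, 84.01)
)

# Room is a grouping key here, so mean() is never applied to it;
# only the three measurement columns are averaged.
newDay <- Day %>%
  group_by(Date, Room) %>%
  summarise(across(c(Temperature, Light, RH), ~ mean(.x, na.rm = TRUE)),
            .groups = "drop")
```

This gives one row per Date and Room combination, so Room stays in the result as a label rather than being averaged.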