
A data.frame has two value columns, col1 and col2. For each hour there are two rows: one with a value in col1 and NA in col2, and one with NA in col1 and a value in col2. No single row has values in both columns.

I'm looking for a way to merge the two rows that share the same hour-day-month-year combination. Example:

# before
col1 col2 hour day month year
3    NA   1    3   2     2018  
NA   4    1    3   2     2018


# after
col1 col2 hour day month year
3    4    1    3   2     2018  

I tried using functions from the plyr library (ddply, together with paste) and grouping by the date columns, but I can't get my head around it.

donnut

1 Answer

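Use group_by and summarise_all from dplyr. Each group holds exactly one non-NA value per column, so summing with na.rm=TRUE returns that value: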

library(dplyr)

df %>%
  group_by(year, month, day, hour) %>%
  summarise_all(sum, na.rm=TRUE)

# A tibble: 2 x 6
# Groups:   year, month, day [2]
   year month   day  hour  col1  col2
  <int> <int> <int> <int> <int> <int>
1  2018     2     3     1     3     4
2  2018     4     3     1     8     9

Data:

df
  col1 col2 hour day month year
1    3   NA    1   3     2 2018
2   NA    4    1   3     2 2018
3    8   NA    1   3     4 2018
4   NA    9    1   3     4 2018
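
For newer dplyr versions (1.0.0 and later), summarise_all is superseded by across(). The following is a minimal, self-contained sketch of the same approach under that assumption. The data frame is rebuilt from the table above, and the name merged is only illustrative.

```r
library(dplyr)

# Example data, reconstructed from the table printed above
df <- data.frame(
  col1  = c(3L, NA, 8L, NA),
  col2  = c(NA, 4L, NA, 9L),
  hour  = c(1L, 1L, 1L, 1L),
  day   = c(3L, 3L, 3L, 3L),
  month = c(2L, 2L, 4L, 4L),
  year  = c(2018L, 2018L, 2018L, 2018L)
)

# Collapse each year-month-day-hour group to a single row.
# Summing with na.rm = TRUE keeps the one non-NA value per column.
merged <- df %>%
  group_by(year, month, day, hour) %>%
  summarise(across(c(col1, col2), ~ sum(.x, na.rm = TRUE)), .groups = "drop")

print(merged)
```

Note that sum(..., na.rm = TRUE) returns 0 for a group in which a column is entirely NA, so this only reproduces the desired result when every group really has one value per column.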
andrew_reece