I am trying to merge two dataframes by date in R.
The first dataframe records daily temperatures. It has only 28 rows, and no dates are repeated.
head(df1)
Day MaxTemp MinTemp
2019-06-15 23.8 14.4
2019-06-16 24.9 11.7
2019-06-17 23.2 8.7
The second dataframe records hourly temperatures, and so has many more rows, with dates repeated.
head(df2)
Day Hour Temp
2019-06-15 14 22.8
2019-06-15 15 22.4
2019-06-15 16 21.9
I would like to merge the data to look something like this:
Day Hour Temp MaxTemp MinTemp
2019-06-15 14 22.8 23.8 14.4
2019-06-15 15 22.4 23.8 14.4
2019-06-15 16 21.9 23.8 14.4
But what I end up with is:
allData <- merge(df1, df2, by = "Day", all.y = TRUE)
head(allData)
Day Hour Temp MaxTemp MinTemp
2019-06-15 14 22.8 NA NA
2019-06-15 15 22.4 NA NA
2019-06-15 16 21.9 NA NA
If I pass "all = T" instead, I get the error "Error in x[[n]][i] <- value[[n]] : replacement has length zero".
Does anyone have any idea how I can fix this?
Edit: here is the dput() output for the head of each dataframe.
# head of df1
df1 <- structure(list(
  Day = structure(list(
    sec = c(0, 0, 0, 0, 0, 0),
    min = c(0L, 0L, 0L, 0L, 0L, 0L),
    hour = c(0L, 0L, 0L, 0L, 0L, 0L),
    mday = 15:20,
    mon = c(5L, 5L, 5L, 5L, 5L, 5L),
    year = c(119L, 119L, 119L, 119L, 119L, 119L),
    wday = c(6L, 0L, 1L, 2L, 3L, 4L),
    yday = 165:170,
    isdst = c(1L, 1L, 1L, 1L, 1L, 1L),
    zone = c("CDT", "CDT", "CDT", "CDT", "CDT", "CDT"),
    gmtoff = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_)
  ), class = c("POSIXlt", "POSIXt")),
  Max = c(23.8, 24.9, 23.2, 22.4, 25.1, 24.4),
  Min = c(14.4, 11.7, 8.7, 8.7, 9.8, 10)
), row.names = c(NA, 6L), class = "data.frame")
# head of df2
df2 <- structure(list(
  Date = structure(list(
    sec = c(0, 0, 0, 0, 0, 0),
    min = c(0L, 30L, 0L, 30L, 0L, 30L),
    hour = c(14L, 14L, 15L, 15L, 16L, 16L),
    mday = c(15L, 15L, 15L, 15L, 15L, 15L),
    mon = c(5L, 5L, 5L, 5L, 5L, 5L),
    year = c(119L, 119L, 119L, 119L, 119L, 119L),
    wday = c(6L, 6L, 6L, 6L, 6L, 6L),
    yday = c(165L, 165L, 165L, 165L, 165L, 165L),
    isdst = c(1L, 1L, 1L, 1L, 1L, 1L),
    zone = c("CDT", "CDT", "CDT", "CDT", "CDT", "CDT"),
    gmtoff = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_)
  ), class = c("POSIXlt", "POSIXt")),
  Temp = c(22.8, 22.4, 22.4, 22.3, 21.9, 21.3),
  Hour = c(14L, 14L, 15L, 15L, 16L, 16L),
  Day = structure(c(18062, 18062, 18062, 18062, 18062, 18062), class = "Date")
), row.names = c(NA, 6L), class = "data.frame")
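One thing the dumps above suggest checking: df1$Day is stored as POSIXlt, while df2$Day is a Date, so the two key columns do not share a class. The following is only a minimal sketch, assuming that class mismatch is what breaks the merge. It rebuilds small stand-in frames (toy data using a few values from above, with the Max/Min column names from the dump) rather than the real 28-row data, converts the key with as.Date(), and then merges.

```r
# Toy stand-ins for the real data (illustrative only).
df1 <- data.frame(Max = c(23.8, 24.9), Min = c(14.4, 11.7))
# Assigning after creation keeps the column as POSIXlt, as in the dump above;
# data.frame() itself would convert it to POSIXct.
df1$Day <- as.POSIXlt(c("2019-06-15", "2019-06-16"), tz = "UTC")

df2 <- data.frame(
  Day  = as.Date(c("2019-06-15", "2019-06-15", "2019-06-15")),
  Hour = c(14L, 15L, 16L),
  Temp = c(22.8, 22.4, 21.9)
)

# Inspect the key classes: POSIXlt on one side, Date on the other.
class(df1$Day)
class(df2$Day)

# Convert the daily key to Date so both keys have the same class.
df1$Day <- as.Date(df1$Day)

# Keep every hourly row and attach the matching daily values.
allData <- merge(df2, df1, by = "Day", all.x = TRUE)
allData
```

Putting df2 first with all.x = TRUE keeps one row per hourly reading and places the Hour and Temp columns before the daily ones, which matches the layout I am after.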