I have 2 datasets that looks like the following (with about 600 total entries for both) Dataframe1
name | A_time | measurement_A |
---|---|---|
A | 6/10/22 7:22 | 3.4 |
A | 6/10/22 16:15 | 4.4 |
A | 6/11/22 6:15 | 5.5 |
A | 6/11/22 15:44 | 6.6 |
B | 5/20/22 6:30 | 7.6 |
B | 5/20/22 13:21 | 1.2 |
B | 5/21/22 8:43 | 2.4 |
B | 5/21/22 14:35 | 4.4 |
Dataframe2
name | B_time | measurement_B |
---|---|---|
A | 6/10/22 7:24 | 50 |
A | 6/10/22 16:55 | 50.1 |
A | 6/11/22 6:13 | 52 |
A | 6/11/22 15:21 | 61 |
B | 5/20/22 6:30 | 54 |
B | 5/20/22 13:22 | 53 |
B | 5/21/22 8:00 | 54.6 |
B | 5/21/22 14:12 | 76.1 |
I want to combine these two dataframes by name and date, selecting for rows only where Dataframe1 and Dataframe2 are within 30 minutes of each other.
I'm hoping to get something like this: Dataframe3
name | A_time | measurement_A | measurement_B |
---|---|---|---|
A | 6/10/22 7:22 | 3.4 | 50 |
A | 6/11/22 6:15 | 5.5 | 52 |
A | 6/11/22 15:44 | 6.6 | 61 |
B | 5/20/22 6:30 | 7.6 | 54 |
B | 5/20/22 13:21 | 1.2 | 53 |
B | 5/21/22 14:35 | 4.4 | 76.1 |
The code I am using:
library(dplyr)
dataframe3 <- dataframe1 %>%
inner_join(dataframe2, by = "name") %>%
filter(abs(A_time - B_time) <= as.difftime(30, units = "mins"))
But I am getting an error because the names in dataframe1 correlated to multiple rows in dataframe2. How can I get the Dataframe3 that I want?