I would like to merge two databases with inner and left joint based on n_pz and dates.
In db_1 I have date of admission and discharge.
In db_2 I have date of exams.
I need a database with only the patients that had exams during the recovery (inner joint) and an other database with all the patients that had recovery complete with exams done during the recovery (left joint).
Here's the problem. I have date of admission and disharge in db_1 and I have to keep only the record from db_2 with dates between the date of admission and that of discharge.
db_1:
n_pz admission_date discharge_date
1 1 02/01/2000 07/01/2000
2 2 15/01/2000 16/01/2000
3 3 20/02/2000 27/02/2000
4 4 27/02/2000 29/02/2000
5 5 10/03/2000 13/03/2000
6 6 20/04/2000 25/04/2000
db_2
n_pz buffer_date buffer_result
1 1 01/01/2000 0
2 1 02/01/2000 1
3 1 03/01/2000 0
4 1 05/01/2000 0
5 2 15/01/2000 1
6 2 17/01/2000 1
7 3 25/02/2000 0
8 4 26/02/2000 1
9 5 15/03/2000 0
10 6 17/03/2000 1
11 6 24/04/2000 0
I don't know how to use merge
in this case cause I don't have the same variable to merge..
I would like to obtain:
out_db (left joint)
n_pz admission_date discharge_date buffer_date buffer_result
1 02/01/2000 07/01/2000 02/01/2000 1
1 02/01/2000 07/01/2000 03/01/2000 0
1 02/01/2000 07/01/2000 05/01/2000 0
2 15/01/2000 16/01/2000 15/01/2000 1
3 20/02/2000 27/02/2000 25/02/2000 0
4 27/02/2000 29/02/2000
5 10/03/2000 13/03/2000
6 20/04/2000 25/04/2000 24/04/2000 0
out_db (inner joint)
n_pz admission_date discharge_date buffer_date buffer_result
1 02/01/2000 07/01/2000 02/01/2000 1
1 02/01/2000 07/01/2000 03/01/2000 0
1 02/01/2000 07/01/2000 05/01/2000 0
2 15/01/2000 16/01/2000 15/01/2000 1
3 20/02/2000 27/02/2000 25/02/2000 0
6 20/04/2000 25/04/2000 24/04/2000 0
I've tried in ths way but it didn't work:
fmt <- "%d/%m/%Y"
db_11 <- transform(db_1, admission_date = as.Date(admission_date, fmt),
discharge_date = as.Date(discharge_date, fmt))
db_21 <- transform(db_2, buffer_date = as.Date(buffer_date, fmt))
s <- subset(merge(db_11, db_21,by = "n_pz", all.x = TRUE), admission_date <= buffer_date & discharge_date >= buffer_date)
merge(db_11, s, by = "n_pz",all.x = TRUE)
I obtain a db with 0 observations.
I've tried also in this way:
fmt <- "%d/%m/%Y"
db_11 <- transform(db_1, admission_date = as.Date(admission_date, fmt),
discharge_date = as.Date(discharge_date, fmt))
db_21 <- transform(db_2, buffer_date = as.Date(buffer_date, fmt))
df3 <- dplyr::left_join(db_11, db_21, by=c("n_pz"," admission_date " <= " buffer_date " & " discharge_date " >= " buffer_date "))
But I obtain this error message:
Errore: Join columns must be present in data.
x Problem with `FALSE`.
Then I would like to transpose multiple data for dates and results of exams.. but maybe I should create a different post for this..
out db (left joint) with transpose
n_pz admission_date discharge_date buffer_date buffer_result buffer_date_1 buffer_result_1 buffer_date_1 buffer_result_1
1 02/01/2000 07/01/2000 02/01/2000 1 03/01/2000 0 05/01/2000 0
2 15/01/2000 16/01/2000 15/01/2000 1
3 20/02/2000 27/02/2000 25/02/2000 0
4 27/02/2000 29/02/2000
5 10/03/2000 13/03/2000
6 20/04/2000 25/04/2000 24/04/2000 0
out db (inner joint) with transpose
n_pz admission_date discharge_date buffer_date buffer_result buffer_date_1 buffer_result_1 buffer_date_1 buffer_result_1
1 02/01/2000 07/01/2000 02/01/2000 1 03/01/2000 0 05/01/2000 0
2 15/01/2000 16/01/2000 15/01/2000 1
3 20/02/2000 27/02/2000 25/02/2000 0
6 20/04/2000 25/04/2000 24/04/2000 0