# install.packages("dplyr")  # one-time setup
library(dplyr)  # also provides the %>% pipe, so magrittr need not be loaded separately

# If DOD is stored as text in the dd-mm-yy form shown below, pass format = "%d-%m-%y" here.
PATIENT$DOD <- as.Date(PATIENT$DOD)
patientsID <- unique(PATIENT$PatientID)

# Build a lookup: patient ID -> that patient's diagnoses, sorted by date.
patient_diag_dict <- list()
for (patient_id in patientsID) {
  patientDiagnosisDF <- PATIENT %>%
    filter(PatientID == patient_id)
  patient_data <- data.frame(date = patientDiagnosisDF$DOD, id = patientDiagnosisDF$DID)
  patient_data <- patient_data[order(patient_data$date), ]
  patient_diag_dict[[toString(patient_id)]] <- patient_data
}

# For each admission, find the most recent diagnosis on or before the procedure date.
diagID <- c()
for (row in seq_len(nrow(ADMISSION))) {
  admis_row <- ADMISSION[row, ]
  patient <- toString(admis_row$PatientID)
  proc_date <- as.Date(admis_row$PROCEDUREDATE)
  # Patients with no diagnosis record get NA so diagID stays aligned with ADMISSION.
  if (!(patient %in% names(patient_diag_dict))) {
    diagID <- c(diagID, NA)
    next
  }
  all_diag_date <- patient_diag_dict[[patient]]$date
  all_diag_ids <- patient_diag_dict[[patient]]$id
  # Defaults to the earliest diagnosis, even when the procedure precedes it.
  mostRecentDiagID <- all_diag_ids[1]
  for (i in seq_along(all_diag_date)) {
    if (proc_date >= all_diag_date[i]) {
      mostRecentDiagID <- all_diag_ids[i]
    } else {
      break  # dates are sorted, so no later diagnosis can qualify
    }
  }
  diagID <- c(diagID, mostRecentDiagID)
}
ADMISSION$DID <- diagID
I am trying to take the diagnosis date (DOD) from the first data frame and the procedure date (PROCEDUREDATE) from the second data frame and compare them. If the procedure date is >= the diagnosis date, I want to keep that record; if not, break out of the loop. But the break inside the for loop does not seem to work. The two data frames are listed below.
PATIENT dataframe
PatientID  DOD       PRIMARYSITE
27         23-10-08  TRUNK
350        12-10-09  TRUNK
350        05-07-10  NECK
663        31-07-09  UPPERLIMB
663        25-02-09  TRUNK
663        24-06-09  TRUNK
585        03-10-11  HIP
736        30-01-13  OTHER
ADMISSION dataframe
PatientID  PROCEDUREDATE  ICD
27         25-09-13       SEDATION
27         25-09-13       LARGE BURSA
27         25-09-13       GENERAL ANESTHIA
27         18-06-04       SEDATION
27         16-07-04       LYMPHGROIN
27         31-08-04       GROIN
27         28-09-04       SEDATION
27         18-06-04       BIOPSY
663        20-04-10       DIETICS
663        10-02-09       SEDATION
663        15-03-11       EYELID
663        09-04-10       PHYSIOTHERAPY
663        20-08-09       BIOPSY
663        09-07-12       SEDATION
585        10-03-10       ANAESESIA
585        10-11-11       BIOPSY
585        08-09-13       SEDATION
585        12-04-08       DIETICS
736        02-05-09       CHEMO
736        09-07-14       BIOPSY
736        10-08-13       SEDATION
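A side note on the dates above, in case it matters: if the columns are stored as text exactly as printed (dd-mm-yy), then as.Date() without a format argument reads the first field as the year, and any later comparison is made on the wrong dates. This is only a small check under that assumption; the vector `raw` is made up for illustration and is not part of my real data.

```r
# Two dates copied from the PATIENT table, assumed to be dd-mm-yy text.
raw <- c("23-10-08", "31-07-09")

# With no format, as.Date() tries "%Y-%m-%d" first, so "23" and "31" become the year.
default_parse <- as.Date(raw)

# Stating the format explicitly gives 23 Oct 2008 and 31 Jul 2009.
explicit_parse <- as.Date(raw, format = "%d-%m-%y")

print(default_parse)
print(explicit_parse)
```

If the columns are already of class Date, or are stored as yyyy-mm-dd text, this does not apply.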
This is the ADMISSION data frame I currently get:
PatientID  PROCEDUREDATE  ICD               DID
27         25-09-13       SEDATION          2
27         25-09-13       LARGE BURSA       2
27         25-09-13       GENERAL ANESTHIA  2
27         18-06-04       SEDATION          2
27         16-07-04       LYMPHGROIN        2
27         31-08-04       GROIN             2
27         28-09-04       SEDATION          2
27         18-06-04       BIOPSY            2
663        20-04-10       DIETICS           5
663        10-02-09       SEDATION          6
663        15-03-11       EYELID            5
663        09-04-10       PHYSIOTHERAPY     5
663        20-08-09       BIOPSY            5
663        09-07-12       SEDATION          5
585        10-03-10       ANAESESIA         8
585        10-11-11       BIOPSY            8
585        08-09-13       SEDATION          8
585        12-04-08       DIETICS           8
736        02-05-09       CHEMO             9
736        09-07-14       BIOPSY            9
736        10-08-13       SEDATION          9
But what I actually want is only the rows where the PROCEDUREDATE is after the DOD, because I want the dates between the DOD and the PROCEDUREDATE to determine the recurrence. My output should look like this:
ADMISSION dataframe
PatientID  PROCEDUREDATE  ICD               DID
27         25-09-13       SEDATION          2
27         25-09-13       LARGE BURSA       2
27         25-09-13       GENERAL ANESTHIA  2
663        20-04-10       DIETICS           5
663        15-03-11       EYELID            5
663        09-04-10       PHYSIOTHERAPY     5
663        20-08-09       BIOPSY            5
663        09-07-12       SEDATION          5
585        10-11-11       BIOPSY            8
585        08-09-13       SEDATION          8
585        12-04-08       DIETICS           8
736        09-07-14       BIOPSY            9
736        10-08-13       SEDATION          9
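To make the rule concrete (keep an admission only if its procedure date is on or after a diagnosis date, and tag it with the most recent such diagnosis), here is a rough loop-free sketch of what I am after. It uses base R only and a cut-down toy version of the two tables; the DID values, the dd-mm-yy date format, and the helper column `adm_row` are assumptions for illustration, not my real data.

```r
# Toy data in the same shape as the tables above; DID values are made up.
PATIENT <- data.frame(
  PatientID = c(27, 663, 663),
  DOD       = c("23-10-08", "25-02-09", "24-06-09"),
  DID       = c(2, 6, 5),
  stringsAsFactors = FALSE
)
ADMISSION <- data.frame(
  PatientID     = c(27, 27, 663, 663),
  PROCEDUREDATE = c("25-09-13", "18-06-04", "10-02-09", "20-08-09"),
  ICD           = c("SEDATION", "BIOPSY", "SEDATION", "BIOPSY"),
  stringsAsFactors = FALSE
)

# Assumes the dates are dd-mm-yy text.
PATIENT$DOD <- as.Date(PATIENT$DOD, format = "%d-%m-%y")
ADMISSION$PROCEDUREDATE <- as.Date(ADMISSION$PROCEDUREDATE, format = "%d-%m-%y")

# Hypothetical helper: a row number so identical admissions stay distinct.
ADMISSION$adm_row <- seq_len(nrow(ADMISSION))

# Pair every admission with every diagnosis of the same patient.
pairs <- merge(ADMISSION, PATIENT, by = "PatientID")

# Keep only pairs where the procedure is on or after the diagnosis.
pairs <- pairs[pairs$PROCEDUREDATE >= pairs$DOD, ]

# Latest diagnosis first, then keep one row per admission.
pairs <- pairs[order(pairs$DOD, decreasing = TRUE), ]
result <- pairs[!duplicated(pairs$adm_row), ]
result <- result[order(result$adm_row),
                 c("PatientID", "PROCEDUREDATE", "ICD", "DID")]
print(result)
```

On this toy input, the admissions that precede every diagnosis of their patient are dropped instead of being given the first diagnosis ID.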