I have been trying to calculate the difference in time, in days, between two date columns. Somehow, R does not seem to recognize the Y-m-d format in one of them and returns wrong values.
I have already tried the following, without success:
t1$date <- dmy(t1$date_admission)
The two columns and the result I get are as follows:
> print(df$data_int_uti)
[1] "2020-06-07" "2020-09-07" "2020-02-08" "2020-08-15" "2020-08-15" "2020-08-18" "2020-08-25" "2020-08-29" "2020-06-30"
[10] "2020-05-07" "2020-07-15" "2020-08-14" "2020-01-09" "2020-09-09" "2020-12-09" "2020-02-07" "2020-09-07" "2020-02-08"
[19] "2020-08-15" "2020-02-09" "2020-06-07" "2020-06-07" "2020-07-29" "2020-08-16" "2020-08-21" "2020-08-22" "2020-01-07"
[28] "2020-04-07" "2020-02-07" "2020-01-09" "2020-06-07" "2020-09-08" "2020-10-08" "2020-08-14" "2020-08-27" "2020-08-30"
[37] "2020-07-16" "2020-07-23" "2020-09-14" "2020-01-07" "2020-04-07" "2020-07-07" "2020-07-07" "2020-10-07" "2020-07-25"
[46] "2020-03-08" "2020-08-31" "2020-02-07" "2020-06-07" "2020-08-13" "2020-08-24" "2020-01-07" "2020-07-18" "2020-09-15"
[55] "2020-01-07" "2020-07-07" "2020-07-17" "2020-07-27" "2020-08-14" "2020-10-09" "2020-09-14" "2020-04-08" "2020-01-07"
[64] "2020-01-07" "2020-12-07" "2020-07-27" "2020-04-08" "2020-08-16" "2020-02-07" "2020-07-07" "2020-07-20" "2020-08-19"
[73] "2020-03-09" "2020-05-09"
> print(df$data_inicio_sint)
[1] "2020-06-27" NA "2020-07-29" NA "2020-07-31" "2020-08-19" "2020-08-22" "2020-08-18" "2020-06-29"
[10] "2020-06-25" "2020-07-14" "2020-05-09" "2020-01-10" "2020-08-31" "2020-08-30" "2020-06-28" "2020-09-08" "2020-07-23"
[19] "2020-12-09" "2020-08-22" "2020-04-08" "2020-06-25" "2020-07-20" "2020-08-16" "2020-12-09" "2020-08-23" "2020-06-30"
[28] "2020-06-26" "2020-03-31" "2020-08-23" "2020-06-21" "2020-07-29" "2020-07-29" "2020-08-01" "2020-08-19" "2020-08-14"
[37] "2020-06-30" "2020-07-22" "2020-09-10" "2020-07-01" "2020-02-08" "2020-06-08" "2020-06-23" "2020-06-27" "2020-07-17"
[46] "2020-07-29" "2020-08-31" "2020-06-20" "2020-03-08" "2020-02-09" "2020-08-24" "2020-01-08" "2020-06-08" "2020-10-10"
[55] "2020-06-23" "2020-05-08" "2020-10-08" "2020-07-24" "2020-07-09" "2020-08-29" "2020-10-10" "2020-02-09" "2020-06-23"
[64] "2020-06-22" "2020-08-08" "2020-07-21" "2020-07-28" "2020-05-09" "2020-06-19" "2020-07-08" "2020-07-14" "2020-10-09"
[73] "2020-01-10" "2020-12-09"
> diff(df$data_int_uti - df$data_inicio_sint)
Time differences in days
[1] NA NA NA NA -16 4 8 -10 -50 50 96 -98 10 92 -243 141 -165 50 -79 255 -78 27 -9
[24] -110 109 -174 95 27 -174 213 55 30 -58 -5 8 0 -15 3 -180 235 -30 -15 88 -94 -151 143
[47] -134 225 95 -186 -1 41 -65 -143 228 -143 86 33 5 -67 85 -227 1 288 -115 -117 210 -232 132
[70] 7 -57 110 -273
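Expected outcome: the time interval, in days, between the date of symptom onset and the date of hospital admission for each row.
For example, for the first row, the interval between 2020-06-07 and 2020-06-27 is 20 days,
so the output would look like [1] 20, and so on for the remaining rows.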
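To make the expected result concrete, here is a minimal sketch in base R of the row-by-row calculation, using only the first three values of each column from the dput. The vector names `admission`, `symptoms`, `interval` and `days` are illustrative and not part of the original data. It subtracts the two Date vectors directly rather than wrapping the subtraction in diff(), since diff() returns the differences between consecutive elements of its argument.

```r
# Illustrative vectors: the first three values of each column from the dput.
admission <- as.Date(c("2020-06-07", "2020-09-07", "2020-02-08"))
symptoms  <- as.Date(c("2020-06-27", NA, "2020-07-29"))

# Subtracting two Date vectors is element-wise and yields a difftime in days.
# (diff() would instead return differences between consecutive elements.)
interval <- symptoms - admission

# Convert the difftime to plain numbers; missing dates propagate as NA.
days <- as.numeric(interval, units = "days")
print(days)
```

For these three rows the result is 20, NA and 172. Whether the sign should be flipped depends on which of the two dates is expected to come first.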
Any help would be greatly appreciated.
Here's the dput:
> dput(t1)
structure(list(data_int_uti = structure(c(18420, 18512, 18300, 18489, 18489, 18492, 18499, 18503, 18443, 18389, 18458, 18488, 18270, 18514, 18605, 18299, 18512, 18300, 18489, 18301, 18420, 18420, 18472, 18490, 18495, 18496, 18268, 18359, 18299, 18270, 18420, 18513, 18543, 18488, 18501, 18504, 18459, 18466, 18519, 18268, 18359, 18450, 18450, 18542, 18468, 18329, 18505, 18299, 18420, 18487, 18498, 18268, 18461, 18520, 18268, 18450, 18460, 18470, 18488, 18544, 18519, 18360, 18268, 18268, 18603, 18470, 18360, 18490, 18299, 18450, 18463, 18493, 18330, 18391), class = "Date"), data_inicio_sint = structure(c(18440, NA, 18472, NA, 18474, 18493, 18496, 18492, 18442, 18438, 18457, 18391, 18271, 18505, 18504, 18441, 18513, 18466, 18605, 18496, 18360, 18438, 18463, 18490, 18605, 18497, 18443, 18439, 18352, 18497, 18434, 18472, 18472, 18475, 18493, 18488, 18443, 18465, 18515, 18444, 18300, 18421, 18436, 18440, 18460, 18472, 18505, 18433, 18329, 18301, 18498, 18269, 18421, 18545, 18436, 18390, 18543, 18467, 18452, 18503, 18545, 18301, 18436, 18435, 18482, 18464, 18471, 18391, 18432, 18451, 18457, 18544, 18271, 18605), class = "Date")), row.names = c(NA, -74L), class = c("tbl_df", "tbl", "data.frame"))