patient.id date type
5 1053 2006/12/14 DX
2 1053 2007/4/21 HSCT
1 1053 2007/5/29 FU
6 1053 2007/7/20 FU
3 1053 2007/9/20 FU
4 1053 2007/11/18 D1
7 1138 2009/9/3 DX
13 1138 2010/2/3 HSCT
23 1138 2010/3/11 FU
10 1138 2010/6/6 FU
9 1138 2010/8/31 FU
15 1138 2010/11/5 FU
11 1138 2011/2/7 FU
16 1138 2011/5/15 FU
17 1138 2011/7/18 FU
14 1138 2011/9/21 FU
24 1138 2011/12/13 FU
19 1138 2012/3/13 FU
25 1138 2012/5/11 D1
For example, for patient.id 1053 the survival time is 2007/11/18 - 2006/12/14. – WENWEN LI Mar 21 '19 at 22:20
2 Answers
1
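A base R solution (this assumes dat$date is of class Date, so convert it first if it is stored as text):
> dat$date <- as.Date(dat$date, "%Y/%m/%d")
> lapply(with(dat, split(date, patient.id)), function(x) diff(range(x)))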
$`1053`
Time difference of 339 days
$`1138`
Time difference of 981 days

Jilber Urbina
If the last type for a patient is HSCT or D2, the patient is right-censored and should be marked as 1; type D1 means death and should be marked as 0. How can I generate a column with this indicator? – WENWEN LI Mar 23 '19 at 21:20
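One possible way to build the indicator described in this comment, as a minimal base R sketch rather than a tested solution. Patient 1053 is copied from the question; patient 9999 is a made-up example whose last record is HSCT, added only so the censored case shows up. The column name `censored` is also an assumption.

```r
# Patient 1053 comes from the question; patient 9999 is hypothetical and
# exists only to exercise the right-censored branch.
dat <- data.frame(
  patient.id = c(1053, 1053, 1053, 1053, 1053, 1053, 9999, 9999),
  date = as.Date(c("2006/12/14", "2007/4/21", "2007/5/29", "2007/7/20",
                   "2007/9/20", "2007/11/18", "2010/1/5", "2010/6/1"),
                 format = "%Y/%m/%d"),
  type = c("DX", "HSCT", "FU", "FU", "FU", "D1", "DX", "HSCT"),
  stringsAsFactors = FALSE
)

# Sort so that the last row within each patient is the latest record.
dat <- dat[order(dat$patient.id, dat$date), ]

# Type of the last record for each patient.
last_type <- tapply(dat$type, dat$patient.id, function(x) x[length(x)])

# Coding from the comment: 1 = right-censored (HSCT or D2), 0 = death (D1).
# Any other final type is left as NA instead of being guessed.
status <- ifelse(last_type %in% c("HSCT", "D2"), 1L,
                 ifelse(last_type == "D1", 0L, NA_integer_))

result <- data.frame(patient.id = as.integer(names(last_type)),
                     censored = as.vector(status))
print(result)
```

Note that `survival::Surv()` uses the opposite convention by default (1 = event, 0 = censored), so the column would probably need to be flipped before being passed to that package.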
0
Use dplyr to convert to date format, then group by patient and calculate max(date) - min(date).
library(dplyr)
mydata %>%
mutate(date = as.Date(date, "%Y/%m/%d")) %>%
group_by(patient.id) %>%
summarise(Survival = as.numeric(max(date) - min(date)))
Result:
patient.id Survival
<int> <dbl>
1 1053 339
2 1138 981

neilfws
How can I calculate the days from the HSCT record to the last record? DX means diagnosis and HSCT means transplant. – WENWEN LI Mar 23 '19 at 20:56
The patient's survival time runs from the second record (HSCT) to the last record of each patient. HSCT means hematopoietic stem cell transplantation; DX means diagnosis. – WENWEN LI Mar 23 '19 at 21:00
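The same dplyr pattern could probably be adapted to start the clock at the transplant date instead of the first record. The sketch below assumes each patient has exactly one HSCT row (patients without one would get NA); the name `post_hsct` is only illustrative, and the data frame is rebuilt from the question so the example is self-contained.

```r
library(dplyr)

# Data copied from the question.
mydata <- data.frame(
  patient.id = c(rep(1053L, 6), rep(1138L, 13)),
  date = c("2006/12/14", "2007/4/21", "2007/5/29", "2007/7/20", "2007/9/20",
           "2007/11/18",
           "2009/9/3", "2010/2/3", "2010/3/11", "2010/6/6", "2010/8/31",
           "2010/11/5", "2011/2/7", "2011/5/15", "2011/7/18", "2011/9/21",
           "2011/12/13", "2012/3/13", "2012/5/11"),
  type = c("DX", "HSCT", "FU", "FU", "FU", "D1",
           "DX", "HSCT", rep("FU", 10), "D1"),
  stringsAsFactors = FALSE
)

post_hsct <- mydata %>%
  mutate(date = as.Date(date, "%Y/%m/%d")) %>%
  group_by(patient.id) %>%
  summarise(
    # Days from the (first) HSCT record to the latest record of the patient.
    # Assumption: one HSCT row per patient; NA if there is none.
    Survival = as.numeric(max(date) - date[type == "HSCT"][1])
  )

print(post_hsct)
```

With the question's data this should give 211 days for patient 1053 and 828 days for patient 1138.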