-2

My data set looks like below

ID     Diagnosis date    Procedure date
1      2005-09-09        2008-04-09
1      2006-05-09        2007-08-08
2      2007-07-02        2007-08-01
2      2007-07-02        2009-08-05
2      2008-05-8         2007-08-10

I want to group my data like below

ID      Diagnosis date     Procedure date
1       2005-09-09         2007-08-08
        2006-05-09         2008-04-09
2       2007-07-02         2007-08-01
                           2007-07-10
        2008-05-08         2009-08-05

Basically the procedure date should be after the diagnosis date

  • Great! Do you actually have spaces in your column names? What have you tried so far? – r2evans Sep 03 '18 at 04:36
  • Possible duplicate of https://stackoverflow.com/questions/52142189/sort-a-column-and-create-a-new-one – akrun Sep 03 '18 at 04:37
  • nope. there are no spaces in column names – user10277618 Sep 03 '18 at 04:39
  • I was trying to create a separate column to match with the diagnosis date – user10277618 Sep 03 '18 at 04:40
  • 3
    This isn't about arranging column data, it's about modifying data between columns. If your data is truly `Date` objects, then you should be able to solve your problem with `pmin` and `pmax`. – r2evans Sep 03 '18 at 04:48
  • You'd probably be better off stacking this in a long form. `ID / Date / DateType` where `DateType` has two options - `"Diagnosis"` or `"Procedure"`. That way you could create a counter going down the page to identify each set of related events. – thelatemail Sep 03 '18 at 05:13

2 Answers2

1

How about this solution. Some sample data:

dat <- read.table(header=TRUE, stringsAsFactors=FALSE, text='
ID     Diagnosis         Procedure
1      2005-09-09        2008-04-09
1      2006-05-09        2007-08-08
2      2007-07-02        2007-08-01
2      2007-07-02        2009-08-05
2      2008-05-08        2007-08-10')

Convert them to Date objects. (Since the strings are well-formatted, though, this might work just fine without converting them to dates. I guess it's just my habit to make it a "proper date object".)

dat$Diagnosis <- as.Date(dat$Diagnosis)
dat$Procedure <- as.Date(dat$Procedure)

min returns a single minimum value for a vector. pmin returns the pair-wise minima between vectors:

min(c(1,1,3,4), c(2,2,4,3))
# [1] 1
pmin(c(1,1,3,4), c(2,2,4,3))
# [1] 1 1 3 3

We can use that compare the two columns:

tmp1 <- pmin(dat$Diagnosis, dat$Procedure)
tmp2 <- pmax(dat$Diagnosis, dat$Procedure)

and store them back in-place:

dat$Diagnosis <- tmp1
dat$Procedure <- tmp2
r2evans
  • 141,215
  • 6
  • 77
  • 149
  • Nope. Its not doing what I am expecting – user10277618 Sep 03 '18 at 05:22
  • Forgive my sarcasm, but ... with *that* clarity, I'm sure I can straighten things out right away. You only said *"procedure date should be after the diagnosis date"* **which this does**, but you apparently have unspoken requirements, too. Perhaps you have no problem mixing dates from different rows? Why does your first procedure go after the first diagnosis and not after the second? Why are your procedures not all ordered? Do you actually expect subsequent same `ID`s to be blank? Is this for data processing or just for display? You need to improve your question to get a better answer. – r2evans Sep 03 '18 at 16:22
  • Good refs for reproducibility: https://stackoverflow.com/questions/5963269/, https://stackoverflow.com/help/mcve, and https://stackoverflow.com/tags/r/info. Please take a quick look at those and think about how you can improve your question. – r2evans Sep 03 '18 at 16:23
1

Hope the following code solves:

library(dplyr)
data <- data.frame(ID =  c(1,1,2,2,2), Diagnosis = c("2005-09-09","2006-05-09","2007-07-02","2007-07-02","2008-05-08"),
                   Procedure =  c("2008-04-09","2007-08-08","2007-08-01","2009-08-05","2007-08-10"))

data$Diagnosis <- as.Date(data$Diagnosis)
data$Procedure <- as.Date(data$Procedure)


data1 <- data[,-2] %>%
  group_by(ID) %>%
  arrange( ID,Procedure)
out <- data.frame(data1,data[2])
out <- out[,c(1,3,2)]
out

ID  Diagnosis  Procedure
1  1 2005-09-09 2007-08-08
2  1 2006-05-09 2008-04-09
3  2 2007-07-02 2007-08-01
4  2 2007-07-02 2007-08-10
5  2 2008-05-08 2009-08-05
Sri Pallavi
  • 196
  • 5