
I would like to plot a life lines diagram for my data so that the readers can understand how the data is shaped and what right censoring does to the data.

Ideally I would like it to look something like [this][1]

I need a horizontal line for each participant, starting on the first date of observation and ending on the last day we observed them. Participants who have had their last day of observation should be drawn in a different color (or carry some other indicator).

The data would look like this:

regdate             lastlogindate            censor   duration
2010-02-24 02:30:43 2010-05-27 07:58:17       0       92
2007-12-23 11:16:37 2008-03-07 10:36:29       1       75
2009-01-19 04:23:28 2009-01-24 06:33:38       1        5
2010-07-25 10:24:39 2010-08-11 07:13:25       0       17
2009-08-23 07:18:06 2009-08-24 06:25:35       1        1
2007-08-12 07:24:55 2010-06-01 06:53:57       0     1024
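
The `duration` column appears to be the number of calendar days between `regdate` and `lastlogindate`, ignoring the time of day. Here is a minimal sketch of recomputing it in base R, assuming that interpretation; the data frame name `obs` is only for illustration, and the rows are typed in from the table above.

```r
# The six rows shown in the question, typed in by hand.
obs <- data.frame(
  regdate       = c("2010-02-24 02:30:43", "2007-12-23 11:16:37",
                    "2009-01-19 04:23:28", "2010-07-25 10:24:39",
                    "2009-08-23 07:18:06", "2007-08-12 07:24:55"),
  lastlogindate = c("2010-05-27 07:58:17", "2008-03-07 10:36:29",
                    "2009-01-24 06:33:38", "2010-08-11 07:13:25",
                    "2009-08-24 06:25:35", "2010-06-01 06:53:57"),
  censor        = c(0, 1, 1, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Keep only the "YYYY-MM-DD" part so the time of day does not affect the count.
start <- as.Date(substr(obs$regdate, 1, 10))
end   <- as.Date(substr(obs$lastlogindate, 1, 10))

# Subtracting two Date vectors gives a difftime in days; as.numeric() drops the unit.
obs$duration <- as.numeric(end - start)
obs$duration
```

For these six rows the result matches the `duration` values listed in the table.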

UCLA shows how it's done in Stata. I told my advisor that I can match in R whatever he did in Stata, so I could use some help here, guys :)

EDIT: I finally managed to get it right.

Here is a sample of the data with dput.

structure(list(users_id = c(1747516, 913136, 921278, 1654913, 
782364, 1371798, 1174461, 1493894, 1124186, 1249310), 
regdate = c("2010-08-15 05:50:09", "2009-01-04 13:47:46", "2009-01-07 13:34:53", "2010-06-30 11:19:08", "2008-08-13 06:46:28", "2010-01-26 12:58:20", "2009-08-18 15:13:12", "2010-04-04 11:33:47", "2009-07-10 12:33:41", "2009-10-19 13:30:49" ), 
lastlogindate = c("2010-09-01 05:51:34", "2010-09-17 05:25:00", "2009-05-15 07:55:30", "2010-07-02 07:34:02", "2008-10-25 14:29:50",  "2010-03-17 05:04:58", "2010-07-06 03:48:48", "2010-04-09 19:44:42", "2010-09-03 04:18:18", "2009-10-20 06:26:55"), 
censor6 = c(0, 0, 1, 0, 1, 1, 0, 0, 0, 1)), 
.Names = c("users_id", "regdate", "lastlogindate", "censor6"), 
row.names = c(1L, 2L, 4L, 5L, 7L, 9L, 10L, 11L, 12L, 14L), 
class = "data.frame")

I melted the data with the reshape2 package so that each observation has two rows: one for the start date and one for the end date. Then I added the censoring variable back with merge.

library(reshape2)  # provides melt()

# Create a subset of the data with (at most) 25 observations
sampData1 <- data[c("users_id", "regdate", "lastlogindate")]
sampData1 <- sampData1[sample(nrow(sampData1), min(25, nrow(sampData1))), ]
# Create two entries for each observation: one for the start date, one for the end date
sampData1 <- melt(sampData1, id.vars = "users_id")
sampData1 <- sampData1[order(sampData1$users_id, sampData1$value), ]
# Add a grouping variable; basically the same thing as the user ID, but it looks better on the plot
sampData1$ID <- rep(seq_len(nrow(sampData1) / 2), each = 2)
# Put back the censoring variable (merge joins on the shared column users_id)
sampData1 <- merge(sampData1, data[, c("users_id", "censor6")])
sampData1$censor6 <- as.factor(sampData1$censor6)
# Convert the date strings to date-times so ggplot2 can use a datetime axis
sampData1$value <- as.POSIXct(sampData1$value, origin = "1970-01-01 00:00:00")

Now let us create the plot.

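library(ggplot2)
library(scales)  # provides date_breaks()

# Base plot
gp <- ggplot(sampData1)

# Add the horizontal lines (this is the key step).
# size is a constant, so it goes outside aes(); in ggplot2 >= 3.4.0 use linewidth = 1 instead.
gp <- gp + geom_line(aes(value, ID, group = ID, color = censor6), size = 1)

# Declutter the x-axis labels
gp <- gp + scale_x_datetime(breaks = date_breaks("3 months"))
# Rotate the x-axis labels
gp <- gp + theme(axis.text.x = element_text(angle = 45, hjust = 1))
# Set the line colors for the two censoring levels
gp <- gp + scale_color_manual(values = c("red", "blue"))

# Each layer must be assigned back to gp; printing gp then draws the full plot
gp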
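
As a possible alternative that skips the melt and merge steps, `geom_segment` can draw each life line straight from the wide data, since every row already holds a start and an end date. This is only a sketch under a few assumptions: the rows are the first four users from the dput sample above, the names `wide` and `p` are illustrative, and the timestamps are treated as UTC, which the question does not state.

```r
library(ggplot2)

# First four users from the dput sample in the question.
wide <- data.frame(
  users_id      = c(1747516, 913136, 921278, 1654913),
  regdate       = c("2010-08-15 05:50:09", "2009-01-04 13:47:46",
                    "2009-01-07 13:34:53", "2010-06-30 11:19:08"),
  lastlogindate = c("2010-09-01 05:51:34", "2010-09-17 05:25:00",
                    "2009-05-15 07:55:30", "2010-07-02 07:34:02"),
  censor6       = c(0, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Parse the date strings (the UTC time zone is an assumption).
wide$regdate       <- as.POSIXct(wide$regdate, tz = "UTC")
wide$lastlogindate <- as.POSIXct(wide$lastlogindate, tz = "UTC")
wide$censor6       <- factor(wide$censor6)

# Order by registration date and give each participant a row number for the y axis.
wide <- wide[order(wide$regdate), ]
wide$ID <- seq_len(nrow(wide))

# One horizontal segment per participant, colored by censoring status.
p <- ggplot(wide) +
  geom_segment(aes(x = regdate, xend = lastlogindate,
                   y = ID, yend = ID, color = censor6)) +
  scale_color_manual(values = c("red", "blue")) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
p
```

Compared with the `geom_line` version, this keeps one row per participant, so the censoring variable never has to be merged back in.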

And here is the result.

  • If you're familiar with ggplot2, `geom_pointrange` will take you a long way towards achieving this: http://docs.ggplot2.org/0.9.3.1/geom_errorbar.html. On a general note, your questions are more likely to generate responses if you use `dput` to share your data rather than copy-pasting it, since that makes it a whole lot easier for a potential respondent to recreate your situation: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Pewi Apr 09 '15 at 20:55
  • Hey thanks for the comment, I will investigate this pointrange function. (and will use dput next time :)) – PoorLifeChoicesMadeMeWhoIAm Apr 10 '15 at 01:46
  • OK I got this. I used the third solution on [this thread](http://stackoverflow.com/questions/17120729/ggplot2-plotting-non-contiguous-time-durations-as-a-bar-chart) as a basis – PoorLifeChoicesMadeMeWhoIAm Apr 11 '15 at 05:11
