I have a slightly complicated plotting task. I am halfway there, but not quite sure how to finish it. I have a dataset of the form below, with multiple subjects, each in either Treatgroup 0 or Treatgroup 1, and each subject contributing several rows of data. Each row corresponds to a single timepoint at which there are values in columns "count1", "count2", "weirdname3", etc.

Task 1. I need to calculate "Days", which is just VisitDate minus StartDate, for each row. Should be an apply-type function, I guess.

Task 2. I have to make a multiplot figure with one scatterplot for each of the count variables (a plot for count1, one for count2, etc.). In each scatterplot, I need to plot the value of the count (y-axis) against "Days" (x-axis) and connect the dots for each subject. Subjects in Treatgroup 0 are one color, subjects in Treatgroup 1 are another color. Each scatterplot should be labeled with count1, count2, etc., as appropriate.

I am trying to use the base plotting function, and have taken the approach of writing a plotting function to call later. I think this can work but need some help with syntax.

#Enter example data
tC <- textConnection("
ID  StartDate   VisitDate   Treatstarted    count1  count2  count3  Treatgroup
C0098   13-Jan-07   12-Feb-10   NA  457 343 957 0
C0098   13-Jan-06   2-Jul-10    NA  467 345 56  0
C0098   13-Jan-06   7-Oct-10    NA  420 234 435 0
C0098   13-Jan-05   3-Feb-11    NA  357 243 345 0
C0098   14-Jan-06   8-Jun-11    NA  209 567 254 0
C0098   13-Jan-06   9-Jul-11    NA  223 235 54  0
C0098   13-Jan-06   12-Oct-11   NA  309 245 642 0
C0110   13-Jan-06   23-Jun-10   30-Oct-10   629 2436    45  1
C0110   13-Jan-07   30-Sep-10   30-Oct-10   461 467 453 1
C0110   13-Jan-06   15-Feb-11   30-Oct-10   270 365 234 1
C0110   13-Jan-06   22-Jun-11   30-Oct-10   236 245 23  1
C0151   13-Jan-08   2-Feb-10    30-Oct-10   199 653 456 1
C0151   13-Jan-06   24-Mar-10   3-Apr-10    936 25  654 1
C0151   13-Jan-06   7-Jul-10    3-Apr-10    1147    254 666 1
C0151   13-Jan-06   9-Mar-11    3-Apr-10    1192    254 777 1
")
data1 <- read.table(header=TRUE, tC)
close.connection(tC)

# format date
data1$VisitDate <- with(data1,as.Date(VisitDate,format="%d-%b-%y"))

# stuck: need to define days as VisitDate - StartDate for each row of dataframe (I know I need an apply family fxn here)
data1$Days <- [applyfunction of some kind ](VisitDate,ID,function(x){x-data1$StartDate})))

# Unsure here. Need to define plot function
plot_one <- function(d){
 with(d, plot(Days, Count, t="n", tck=1, cex.main = 0.8, ylab = "", yaxt = 'n', xlab = "", xaxt="n",  xlim=c(0,1000), ylim=c(0,1200))) # set limits
    grid(lwd = 0.3, lty = 7)
    with(d[d$Treatgroup == 0,], points(Days, Count1, col = 1)) 
    with(d[d$Treatgroup == 1,], points(Days, Count1, col = 2))
}

#Create multiple plot figure
par(mfrow=c(2,2), oma = c(0.5,0.5,0.5,0.5), mar = c(0.5,0.5,0.5,0.5))
#trouble here. I need to call the column names somehow, with; plyr::d_ply(data1, ???, plot_one) 
marcel
  • Don't put `rm(list=ls())` in example code lest someone destroy their working data while trying to help. – thelatemail Jul 28 '14 at 05:52
  • Just a note: you can pass a string directly to the `text` argument to `read.table` without using a text connection. – Thomas Jul 28 '14 at 05:53
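
A minimal sketch of what the `text` suggestion could look like. The object name `data_small` is made up for illustration, and the rows are a cut-down excerpt of the example data (three rows, five columns), not the full table:

```r
# Hypothetical three-row excerpt of the question's data, for illustration only.
# The text= argument replaces the textConnection()/close.connection() pair.
data_small <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
ID  StartDate   VisitDate   count1  Treatgroup
C0098   13-Jan-06   2-Jul-10    467 0
C0110   13-Jan-06   23-Jun-10   629 1
C0151   13-Jan-06   24-Mar-10   936 1
")

# Inspect the column types that read.table() inferred
str(data_small)
```

Setting `stringsAsFactors = FALSE` here is a choice for the sketch, so that the date columns stay as plain character strings until they are converted with `as.Date`.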

1 Answer

Task 1:

# VisitDate is already a Date (see the question); Date - Date gives whole days, with no time-zone or DST offset.
data1$days <- as.numeric(data1$VisitDate -
                         as.Date(data1$StartDate, format="%d-%b-%y"))

Task 2:

par(mfrow=c(3,1), oma = c(2,0.5,1,0.5), mar = c(2,0.5,1,0.5))
plot(data1$days, data1$count1, col=as.factor(data1$Treatgroup), main="count1")
plot(data1$days, data1$count2, col=as.factor(data1$Treatgroup), main="count2")
plot(data1$days, data1$count3, col=as.factor(data1$Treatgroup), main="count3")
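
The question also asks for the dots to be connected for each subject, which the three `plot()` calls above do not do. Below is a rough sketch of one way to add that in base graphics, not necessarily what was intended here. It assumes an English locale for the `%b` month names, uses a cut-down excerpt of the example data (object `d`), and reuses the name `plot_one` from the question with an extra, made-up `yvar` argument:

```r
# Hypothetical excerpt of the question's data (two subjects, one per group)
d <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
ID StartDate VisitDate count1 count2 Treatgroup
C0098 13-Jan-06 2-Jul-10 467 345 0
C0098 13-Jan-06 7-Oct-10 420 234 0
C0098 13-Jan-06 9-Jul-11 223 235 0
C0110 13-Jan-06 23-Jun-10 629 2436 1
C0110 13-Jan-06 15-Feb-11 270 365 1
C0110 13-Jan-06 22-Jun-11 236 245 1
")

# Days since start, computed from Date objects (assumes English month names)
d$days <- as.numeric(as.Date(d$VisitDate, format = "%d-%b-%y") -
                     as.Date(d$StartDate, format = "%d-%b-%y"))

# Draw one panel: an empty frame first, then one connected line per subject
plot_one <- function(d, yvar) {
  plot(d$days, d[[yvar]], type = "n", main = yvar, xlab = "Days", ylab = yvar)
  for (subject in split(d, d$ID)) {
    subject <- subject[order(subject$days), ]   # connect visits in time order
    # Treatgroup 0 -> colour 1, Treatgroup 1 -> colour 2 (as in the question)
    lines(subject$days, subject[[yvar]], type = "b",
          col = subject$Treatgroup[1] + 1)
  }
}

# One panel per column whose name starts with "count"
count_vars <- grep("^count", names(d), value = TRUE)
old_par <- par(mfrow = c(length(count_vars), 1), mar = c(4, 4, 2, 1))
for (v in count_vars) plot_one(d, v)
par(old_par)   # restore the previous layout
```

Selecting the columns with `grep("^count", ...)` is only an assumption about the naming; for columns like "weirdname3" the names would have to be listed explicitly, e.g. `count_vars <- c("count1", "weirdname3")`.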
momobo
  • Great solution, thank you. Works exactly as intended. With this solution, is there a way to select the colors for each treatment group? – marcel Jul 29 '14 at 12:58
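
On choosing the colors: one possible approach, sketched under the assumption that Treatgroup is coded as numeric 0/1, is to index a vector of colors by group. The names `treatgroup`, `group_cols` and `point_cols`, and the two colors, are made up for illustration:

```r
# Hypothetical group labels, standing in for data1$Treatgroup
treatgroup <- c(0, 0, 1, 1, 0)

# Assumed colour choice; any valid R colour names would do
group_cols <- c("darkorange", "purple")

# Treatgroup 0 picks element 1, Treatgroup 1 picks element 2
point_cols <- group_cols[treatgroup + 1]
print(point_cols)

# Each point is drawn in its group's colour
plot(seq_along(treatgroup), treatgroup, col = point_cols, pch = 19,
     main = "colour by group")
```

In the `plot()` calls of the answer, this would amount to replacing `col=as.factor(data1$Treatgroup)` with something like `col=group_cols[data1$Treatgroup + 1]`.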