I have an existing ggplot2 scatterplot which shows the results of a parameter plotted against age, taken from a normal database. I then want to add two additional points to this graph, which I would pass as command-line arguments to my script: script age value1 value2. I would like to show these points in red, with an "r" and an "l" geom_text label above each point. I have the following code so far but do not know how to add the finishing touches.

pkgLoad <- function(x)
  {
    if (!require(x,character.only = TRUE))
    {
      install.packages(x,dep=TRUE, repos='http://star-www.st-andrews.ac.uk/cran/')
      if(!require(x,character.only = TRUE)) stop("Package not found")
    }

  }

pkgLoad("ggplot2")
#load current normals database
df<-data.frame(read.csv("dat_normals.txt", sep='\t', header=T))

args<-commandArgs(TRUE)

#specify what each argument is (commandArgs returns character strings, so convert to numeric)
age <- as.numeric(args[1])
rSBR <- as.numeric(args[2])
lSBR <- as.numeric(args[3])

# RUN REGRESSION AND APPEND PREDICTION INTERVALS
lm_fit = lm(SBR ~ Age, data = df)
sbr_with_pred = data.frame(df, predict(lm_fit, interval='prediction'))




p <- ggplot(sbr_with_pred, aes(x=Age, y=SBR)) + 
          geom_point(shape=19, alpha=1/4) +
          geom_smooth(method = 'lm', aes(fill = 'confidence'), alpha = 0.5) +
          geom_ribbon(aes(y = fit, ymin = lwr, ymax = upr, 
                         fill = 'prediction'), alpha = 0.2) + 
          scale_fill_manual('Interval', values = c('green', 'blue')) +
          theme_bw() + 
          theme(legend.position = "none") 


ggsave(filename = "/home/data/wolf/FV_DAT/dat_results.png", plot = p)
browseURL("/home/data/wolf/FV_DAT/dat_results.png")

Essentially, I want to see if the 2 new points fall within the 95% prediction interval from the normal database (the blue ribbon).

moadeep
  • Why all this detail? Can you be more concise, please? Maybe I am missing something, but do you just want to add a new layer with a new data.frame? – agstudy Mar 01 '13 at 13:28
  • I want to add 2 new data points and compare these results to a normal database. Do these points fall within the 95% confidence interval (i.e. the normal range)? – moadeep Mar 01 '13 at 14:14
  • 2
    I think you just do `p + geom_point(data=newdata,colour="red")` where `newdata` is a data frame in the same format (i.e. matching column names) as your original data set. Please note that your current example is not reproducible ... http://tinyurl.com/reproducible-000 – Ben Bolker Mar 01 '13 at 14:27

1 Answer

3

Your example is not reproducible. Creating a small data set and a reproducible example is really worthwhile; it is not a waste of time. As for the solution, it is what the comment already suggests: add a new layer with new data.

  # commandArgs() returns character strings, so convert them to numbers
  newdata <- data.frame(Age = as.numeric(args[1]),
                        SBR = as.numeric(c(args[2], args[3])))
  p + geom_point(data=newdata,colour="red",size=10)

For example:

# simulated data standing in for the normals database
sbr_with_pred <- data.frame(Age = sample(15:36, 50, replace = TRUE), SBR = rnorm(50))

p <- ggplot(sbr_with_pred, aes(x=Age, y=SBR)) + 
  geom_point(shape=19, alpha=1/4) +
  geom_smooth(method = 'lm', aes(fill = 'confidence'), alpha = 0.5) 

# one age followed by two simulated SBR values
args <- c(20, rnorm(2))
newdata <- data.frame(Age = args[1],
                      SBR = c(args[2], args[3]))
p + geom_point(data=newdata,colour="red",size=10)
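
The question also asks for an "r" and an "l" label above each red point. A minimal sketch of one way to do this is to carry the label in an extra column of newdata and add a geom_text layer on top of the red points. The simulated data, the hard-coded args vector (a stand-in for commandArgs(TRUE)), the column name label, the object name p_labelled, and the size and vjust values are all assumptions made for illustration.

```r
library(ggplot2)

set.seed(1)  # only so the simulated data is repeatable
sbr_with_pred <- data.frame(Age = sample(15:36, 50, replace = TRUE),
                            SBR = rnorm(50))

# Stand-in for commandArgs(TRUE), which always returns character strings
args <- c("20", "0.5", "-0.3")

# One row per new point; 'label' is a hypothetical column holding the text
newdata <- data.frame(Age   = as.numeric(args[1]),
                      SBR   = as.numeric(args[2:3]),
                      label = c("r", "l"))

p <- ggplot(sbr_with_pred, aes(x = Age, y = SBR)) +
  geom_point(shape = 19, alpha = 1/4) +
  geom_smooth(method = "lm", formula = y ~ x)

# Red points plus a text label nudged above each one (vjust < 0 moves text up)
p_labelled <- p +
  geom_point(data = newdata, colour = "red", size = 3) +
  geom_text(data = newdata, aes(label = label),
            colour = "red", vjust = -1)

print(p_labelled)
```

Both new layers inherit the x and y mappings from the original ggplot() call, so newdata only needs matching Age and SBR column names plus the label column.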


agstudy
  • I'm sorry, I don't understand. In what sense is the data not reproducible? The sbr_with_pred data frame is made from previously measured results, not from a random sample. The blue band should encompass an area which includes 95% of the data points. – moadeep Mar 01 '13 at 15:16
  • Edit: By normal - I mean these results are measured from patients who do not have the diagnosis we are investigating – moadeep Mar 01 '13 at 15:17
  • It is not reproducible in the sense that I can't reproduce your plot on my machine. Even if they are measured data, you can sample them and give us a reproducible example. For more details please see [this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – agstudy Mar 01 '13 at 15:46
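
Whether the two new points fall inside the 95% band can also be checked numerically rather than by eye. The sketch below assumes the same kind of linear fit as in the question (SBR ~ Age) and calls predict() with interval = 'prediction' on the new points. The simulated normals data, the example SBR values, and the column names label and inside are hypothetical.

```r
set.seed(1)  # only so the simulated data is repeatable

# Simulated stand-in for the normals database (hypothetical values)
df <- data.frame(Age = sample(15:36, 50, replace = TRUE))
df$SBR <- 3 - 0.02 * df$Age + rnorm(50, sd = 0.3)

lm_fit <- lm(SBR ~ Age, data = df)

# Hypothetical new measurements for the right and left side at age 20
newdata <- data.frame(Age = 20, SBR = c(2.5, 5.0), label = c("r", "l"))

# 95% prediction interval evaluated at the ages of the new points
pred <- predict(lm_fit, newdata = newdata,
                interval = "prediction", level = 0.95)

# TRUE when the measured value lies between the lower and upper bounds
newdata$inside <- newdata$SBR >= pred[, "lwr"] & newdata$SBR <= pred[, "upr"]
print(newdata)
```

The inside column could then be mapped to colour in the plot, so that points outside the normal range stand out.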