
I want to draw a box plot and a line plot in the same figure with ggplot. I have a data frame that looks like the following:

   Lambda     means  theorMean
1     0.1  10.07989         10
2     0.1  10.55681         10
3     0.1  10.26660         10
4     0.1  10.29234         10
5     0.1  10.07754         10
 ...

The `means` column holds the sample means and `theorMean` holds the theoretical means. I want to show the distribution of the sample means as box plots and the theoretical means as a line.

This is what I have so far ...

library(ggplot2)
library(scales)

p <- ggplot(summ, aes(x=factor(Lambda), y=means)) +
     geom_boxplot() +
     geom_line(data=summ, aes(x=log10(Lambda), y=means))


The problem is that a box plot or a violin plot needs the x axis to be a factor, whereas the line needs the x axis to be numeric. I basically want to overlay a theoretical line on the box plots I generate. How can I possibly do this?

ssm
  • I'm not sure if I misunderstand, but if the problem is only the class of lambda you can simply remove `aes(x=factor(Lambda), y=means)` from `ggplot()` and place it inside of `geom_boxplot()`. Alternatively, you could use `inherit.aes = FALSE` to block the inheritance of the `aes` in `geom_line`. Please let me know if I've misinterpreted something. – lnNoam Jul 23 '15 at 18:04
  • Umm, I want to line up the two plots. In fact I don't even need `theorMean` in the data.frame. I can simply plot `lambda` vs. `1/lambda`. It's just that `lambda` is originally log-scaled, so I can't line up the factors, which are not log-scaled, with a line. I tried your suggestion of `ggplot(summ)+geom_boxplot(aes(x=factor(Lambda), y=means)) + geom_line(aes(x=Lambda, y=means))` and that doesn't help. Thanks for your suggestion though! – ssm Jul 23 '15 at 18:24

1 Answer


This should do the trick:

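library(ggplot2)

# Put Lambda on a log10 scale before it is converted to a factor
summ$Lambda <- log10(summ$Lambda)

ggplot(summ, aes(x = factor(Lambda), y = means)) +
  geom_boxplot() +
  # group = 1 lets geom_line() connect points across the discrete x levels
  geom_line(inherit.aes = FALSE, aes(x = factor(Lambda), y = means, group = 1), color = "blue") +
  ylab("Mean") +
  xlab("Lambda (Log10)") +
  # Round the factor levels for display; a function also works when Lambda values repeat
  scale_x_discrete(labels = function(x) round(as.numeric(x), 2)) +
  theme(
    axis.ticks.y = element_blank(),
    axis.text.x = element_text(angle = 45, hjust = 1)
  )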

Yields:

[Figure: box plots with the blue line overlaid]


Test data:

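# 31 Lambda values between 0.01 and 0.9
summ <- data.frame(Lambda = seq(0.01, 0.9, by = 0.0287097))

# Exponentially decaying means: 15 * e^(-x/10) for x = 1, 2, ..., one per row
x <- seq_len(nrow(summ))
summ$means <- 15 * exp(-x / 10)

# Lambda is left untransformed here: the plotting code above applies log10(),
# and applying it twice would produce NaN for these values.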

This was helpful here.
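
Since the question asks for the theoretical means (rather than the sample means) as the line, a variant of the above might look like the following sketch. It reuses the question's column names (`Lambda`, `means`, `theorMean`) and keeps the discrete x axis. The simulated data (exponential samples with sample size 30 and 50 replicates per `Lambda`) and the helper data frame `theor` are assumptions made up for illustration, not taken from the original post.

```r
library(ggplot2)

# Illustrative data only: sample means of exponential draws (assumed setup)
set.seed(1)
lambdas <- c(0.1, 0.2, 0.5, 1)
summ <- do.call(rbind, lapply(lambdas, function(l) {
  data.frame(
    Lambda    = l,
    means     = replicate(50, mean(rexp(30, rate = l))),
    theorMean = 1 / l
  )
}))

# One row per Lambda, so the line has a single point per box
theor <- unique(summ[, c("Lambda", "theorMean")])

p <- ggplot(summ, aes(x = factor(Lambda), y = means)) +
  geom_boxplot() +
  # group = 1 connects the points across the discrete x levels
  geom_line(data = theor,
            aes(x = factor(Lambda), y = theorMean, group = 1),
            colour = "red") +
  xlab("Lambda") +
  ylab("Mean")
p
```

Because both layers map `factor(Lambda)` to x, the line's points land on the same discrete positions as the boxes. The boxes are evenly spaced, though, so the line's shape does not reflect the numeric distance between `Lambda` values.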

lnNoam
    Thanks very much. I actually used your idea of group and made this: ` theoriticalMeans <- 1/lambdas theorMeans = data.frame( Lambda = lambdas, means = theoriticalMeans ) p <- ggplot() + geom_line(data=theorMeans, aes(x=log10(Lambda), y=means), colour="red", size=1.5) + geom_boxplot(data=summ, aes(x=log10(Lambda), y=means, group=Lambda)) ` – ssm Jul 24 '15 at 13:09
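
The approach in the comment above, a numeric x axis with `group` set so that each `Lambda` gets its own box, can be sketched as a self-contained example. The simulated data (exponential samples with sample size 30 and 50 replicates per `Lambda`) is an assumption for illustration; `theorMeans` follows the comment's `1/lambda` definition.

```r
library(ggplot2)

# Illustrative data only: sample means of exponential draws (assumed setup)
set.seed(1)
lambdas <- c(0.1, 0.2, 0.5, 1)
summ <- do.call(rbind, lapply(lambdas, function(l) {
  data.frame(Lambda = l, means = replicate(50, mean(rexp(30, rate = l))))
}))

# Theoretical mean of an exponential distribution with rate Lambda
theorMeans <- data.frame(Lambda = lambdas, means = 1 / lambdas)

p <- ggplot() +
  # x is numeric, so group = Lambda is needed to get one box per Lambda
  geom_boxplot(data = summ,
               aes(x = log10(Lambda), y = means, group = Lambda)) +
  geom_line(data = theorMeans,
            aes(x = log10(Lambda), y = means),
            colour = "red") +
  xlab("log10(Lambda)") +
  ylab("Mean")
p
```

Here both layers share a continuous x axis, so the boxes sit at their true `log10(Lambda)` positions and the line passes through them at the matching x values.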