
I've been trying to use ggplot2 to produce a plot similar to this R graphic:

xv<-seq(0,4,0.01)
yv<-dnorm(xv,2,0.5) 
plot(xv,yv,type="l") 
polygon(c(xv[xv<=1.5],1.5),c(yv[xv<=1.5],yv[xv==0]),col="grey") 

This is as far as I've gotten with ggplot2:

x<-seq(0.0,0.1699,0.0001)   
ytop<-dnorm(0.12,0.08,0.02)
MyDF<-data.frame(x=x,y=dnorm(x,0.08,0.02))
p<-qplot(x=MyDF$x,y=MyDF$y,geom="line") 
p+geom_segment(aes(x=0.12,y=0,xend=0.12,yend=ytop))

I would like to shade the tail region beyond x=0.12. How would I do this using ggplot or qplot?

Broadly, how does one shade any subset under the curve, whether a tail, or between two arbitrary lines dividing the region into distinct areas?

Thanks for any advice.

Matt
Tim Erickson

3 Answers


Create a polygon with the area you want to shade

#First subset the data and add the corner points so the polygon closes at y = 0
shade <- rbind(c(0.12,0), subset(MyDF, x > 0.12), c(MyDF[nrow(MyDF), "x"], 0))

#Then use this new data.frame with geom_polygon
p + geom_segment(aes(x=0.12,y=0,xend=0.12,yend=ytop)) +
    geom_polygon(data = shade, aes(x, y))
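For the more general case raised in the question, a band between two arbitrary vertical lines, the same subsetting idea applies. The sketch below is only an illustration: the bounds `lower` and `upper` and the name `band` are made-up values, not from the original post, and it uses `geom_area()` (which fills down to y = 0 by itself) rather than building the polygon corners by hand.

```r
library(ggplot2)

# Same curve as in the question: normal density with mean 0.08 and sd 0.02
MyDF <- data.frame(x = seq(0, 0.1699, 0.0001))
MyDF$y <- dnorm(MyDF$x, 0.08, 0.02)

# Hypothetical bounds of the band to shade (illustrative values only)
lower <- 0.06
upper <- 0.10

# Keep only the rows of the curve that fall inside the band
band <- subset(MyDF, x >= lower & x <= upper)

# geom_area() fills between y = 0 and the curve, so no corner points are needed
p_band <- ggplot(MyDF, aes(x, y)) +
  geom_line() +
  geom_area(data = band, fill = "grey50", alpha = 0.5)

p_band
```

Changing `lower` and `upper` moves the band; setting `upper` to `max(MyDF$x)` gives the right tail again.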


Luciano Selzer
  • The answer helps in another way too. I was not thinking in the mode of ggplot2 and creating an explicit data subset. I was trying to make this work from a purely graphical object point of view. – Tim Erickson Sep 14 '12 at 22:40
  • I think you meant (note the location of the last closing parenthesis)... shade <- rbind(c(0.12,0), subset(MyDF, x > 0.12), c(MyDF[nrow(MyDF), "x"], 0)) – ceiling cat May 30 '13 at 09:21

This is essentially a copy of Luciano's answer, which I found useful. Wrapping it in a function may save some time for others wanting to use this approach.

Create the data. Here the density is evaluated at intervals of 0.001 from the 0.1th percentile to the 99.9th percentile of a normal distribution with the specified mean and sd.

mean_ = 10
sd_ = 4

x = seq(qnorm(c(0.001), mean_, sd_),qnorm(c(0.999), mean_, sd_),0.001) 

distdata = data.frame(x=x,y=dnorm(x,mean_,sd_))

A function for shading left or right tails from specific values.

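shade_under_curve = function(p, d, left=NULL, right=NULL, distrib, fill, ...){

  # Build a closed polygon for the requested tail, anchored at y = 0 on both ends
  if(!is.null(left)){
    shade = rbind(c(d[1, "x"], 0), d[d$x<left,], c(left,0))
  } else if(!is.null(right)){
    shade = rbind(c(right,0), d[d$x>right,], c(d[nrow(d), "x"], 0))
  } else {
    stop("Supply either 'left' or 'right'")
  }

  # Cut-off value and the height of the curve at that point
  value = c(left, right)
  ytop<-distrib(value,...)

  # Add the vertical cut-off line and the shaded tail to the existing plot
  p + geom_segment(aes(x=value,y=0,xend=value,yend=ytop)) +
    geom_polygon(data = shade, aes(x, y), alpha=0.2, fill=fill)

}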

Examples:

p = qplot(x=distdata$x,y=distdata$y,geom="line") 

shade_under_curve(p, distdata, left=3, distrib=dnorm, mean=mean_, sd=sd_, fill = "red") 

shade_under_curve(p, distdata, right=15, distrib=dnorm, mean=mean_, sd=sd_, fill = "blue")

p2 = shade_under_curve(p, distdata, left=qnorm(0.025, mean_, sd_), distrib=dnorm, mean=mean_, sd=sd_, fill = "green") 
shade_under_curve(p2, distdata, right=qnorm(0.975, mean_, sd_), distrib=dnorm, mean=mean_, sd=sd_, fill = "green") 
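
If the curve is available as a function (as `distrib` is here), a different technique avoids the pre-computed data frame altogether: `stat_function()` can draw the density as an area restricted to a given x range through its `xlim` argument. This is a sketch of that alternative, not the approach of the answer above; the plotting range of -2 to 22 and the cut-off of 15 are illustrative assumptions.

```r
library(ggplot2)

mean_ <- 10
sd_ <- 4

# Two-row data frame that only fixes the x range of the plot (assumed range)
p_fun <- ggplot(data.frame(x = c(-2, 22)), aes(x)) +
  # The full density curve
  stat_function(fun = dnorm, args = list(mean = mean_, sd = sd_)) +
  # The same function drawn as a filled area, evaluated only on xlim
  stat_function(fun = dnorm, args = list(mean = mean_, sd = sd_),
                geom = "area", xlim = c(15, 22),
                fill = "blue", alpha = 0.2)

p_fun
```

Passing both ends of the region in `xlim`, for example `c(8, 12)`, shades an interior band in the same way.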
Adam Waring

A newer approach to tail-shading uses the layer-blending capability of the ggfx package.

The benefit here is that you don't need to supply information about the curve, only the x-axis limits. Although it's actually a blue "rect" geom, the colour is only retained where it lies atop the orange density layer.

library(tidyverse)
library(ggfx)

tibble(outcome = rnorm(10000, 20, 2)) |> 
  ggplot(aes(outcome)) +
  as_reference(geom_density(adjust = 2, fill = "orange"), id = "density") +
  with_blend(annotate("rect", xmin = 15, xmax = 18, ymin = -Inf, ymax = Inf,
           fill = "blue"), bg_layer = "density", blend_type = "atop")

Created on 2022-04-17 by the reprex package (v2.0.1)

Carl