My goal is to use the geom_density2d() geom to draw contour lines on a scatter plot at user-defined levels. Consider the following code:

library(ggplot2)
n = 100
df = data.frame(x = c(rnorm(n, 0, .5), rnorm(n, 3, .5)),
                y = c(rnorm(n, 1, .5), rnorm(n, 0, .5)))

ggplot(df, aes(x = x, y = y)) +
   geom_density2d() +
   geom_point() 

This produces a standard contour plot, but there doesn't appear to be a way to manually control which contours get drawn. The optional parameters bins and h can control the contour lines to some degree (they are passed to kde2d from MASS, I assume), but the resulting lines do not seem to be interpretable.

Ideally, I would be able to replicate the functionality of plot.kde from the ks package, where the contour levels can be controlled via the cont argument.

library(ks)
est = kde(df)
plot(est, cont = c(50, 95))

erc
Bugbee
Perhaps this is helpful: http://stackoverflow.com/questions/23437000/how-to-plot-a-contour-line-showing-where-95-of-values-fall-within-in-r-and-in – Roman Mar 24 '16 at 15:03

1 Answer

This is a naive attempt, since I rarely write custom functions, so it may not be the best approach. Still, the code does the job. The trick is to reuse the data that ggplot computes for the plot. First, draw the graphic and extract the data behind it, which you can find in ggplot_build(g)$data[1]. That data has a column called level, which holds the density value of each contour line, so you can use it to subset the data for individual contours. In myfun(), you specify which levels you want. The function subsets the data to those levels and draws the figure.

library(ggplot2)

set.seed(111)
mydf = data.frame(x = c(rnorm(100, 0, .5), rnorm(100, 3, .5)),
                  y = c(rnorm(100, 1, .5), rnorm(100, 0, .5)))

g <- ggplot(mydf, aes(x = x, y = y)) +
            geom_density2d() +
            geom_point()

### Get the data used to draw the first layer of g (the density contours).

temp <- as.data.frame(ggplot_build(g)$data[1])

### Check which levels you have in g

ind <- unique(temp$level)

ind
#[1] 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 0.20


myfun <- function(...) {
  ana <- c(...)                        # contour levels requested by the caller
  foo <- subset(temp, level %in% ana)  # keep only the paths at those levels

  g <- ggplot() +
    geom_path(data = foo, aes(x = x, y = y, group = group), colour = "red")

  print(g)
}

### Run myfun, specifying the levels you want.
myfun(0.02, 0.10, 0.18)
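
The levels above are raw density heights, so they are not directly comparable to the cont = c(50, 95) percentages used by plot.kde. A minimal sketch of one way to bridge that gap follows. The helper hdr_levels() is hypothetical (it is not part of ggplot2, MASS, or ks). It estimates the density on a grid with MASS::kde2d and picks the density heights whose contours enclose roughly the requested share of the estimated mass. It assumes that the mass outside the grid is negligible and that kde2d's default bandwidth is close to the one geom_density_2d() uses, so the result is an approximation and will not match ks::kde exactly. The plotting step also assumes a ggplot2 version in which geom_density_2d() accepts breaks (3.3.0 or later, as far as I know).

```r
library(MASS)  # kde2d(); MASS ships with standard R installations

# Hypothetical helper: density heights whose contours enclose roughly
# `probs` of the estimated probability mass.
hdr_levels <- function(x, y, probs = c(0.5, 0.95), n = 100) {
  dens <- MASS::kde2d(x, y, n = n)
  z <- sort(as.vector(dens$z), decreasing = TRUE)
  # All grid cells have the same area, so it cancels in the ratio.
  cum_mass <- cumsum(z) / sum(z)
  # For each p, the height of the first (highest-density) cell at which
  # the accumulated mass reaches p.
  vapply(probs, function(p) z[which(cum_mass >= p)[1]], numeric(1))
}

set.seed(111)
mydf <- data.frame(x = c(rnorm(100, 0, .5), rnorm(100, 3, .5)),
                   y = c(rnorm(100, 1, .5), rnorm(100, 0, .5)))

lv <- hdr_levels(mydf$x, mydf$y, probs = c(0.5, 0.95))

# Assumption: geom_density_2d() forwards `breaks` to the contour computation.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mydf, aes(x = x, y = y)) +
    geom_point() +
    geom_density_2d(breaks = lv, colour = "red")
  print(p)
}
```

Note that the 95% contour sits at a lower density height than the 50% contour, since a larger region has to reach further down the density surface.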

jazzurro