22

I want to overlay two ggplot2 plots with alpha channels in a way that the resulting image shows both datasets. This is my test data:

data = read.table(text="P1 -1 0 4\nP2 0 0 2\nP3 2 1 8\nP4 -2 -2 6\nP5 0.5 2 12")
data2 = read.table(text="Q1 1 1 3\nQ2 1 -1 2\nQ3 -1 1 8")
colnames(data) = c("name","x","y","score")
colnames(data2) = c("name","x","y","score")

And here is how I plot this data:

library(ggplot2)

ggplot(data, aes(x=x,y=y)) + 
  stat_density2d(data=data,geom="tile", aes(fill = ..density..,alpha=..density..), contour=FALSE) + 
  theme(legend.position="none") + scale_fill_gradient (low = "#FFFFFF", high = "#FF0000") + 
  xlim(-3,3) + ylim(-3,3) + 
  geom_point()

ggplot(data2, aes(x=x,y=y)) + 
  stat_density2d(data=data2,geom="tile", aes(fill = ..density..,alpha=..density..), contour=FALSE) + 
  theme(legend.position="none") + 
  scale_fill_gradient (low = "#FFFFFF", high = "#00FF00") + 
  xlim(-3,3) + ylim(-3,3) + 
  geom_point()

The first plot shows data, the second plot data2:

Plot for dataset *data* Plot for dataset *data2*

I now want a combination of both plots. The following image is what I want to get. I produced it with my image editing program on the desktop by multiplying both images as layers.

Both datasets in one plot

I tried to plot one dataset on top of the other, but that doesn't multiply both layers and the second color overwrites the first one.

ggplot(data, aes(x=x,y=y)) + 
  stat_density2d(data=data,geom="tile", aes(fill = ..density..,alpha=..density..), contour=FALSE) + 
  theme(legend.position="none") + scale_fill_gradient (low = "#FFFFFF", high = "#FF0000") + 
  xlim(-3,3) + ylim(-3,3) + 
  stat_density2d(data=data2,geom="tile", aes(fill = ..density..,alpha=..density..), contour=FALSE) + 
  scale_fill_gradient (low = "#FFFFFF", high = "#00FF00")


Additionally I get this warning: Scale for 'fill' is already present. Adding another scale for 'fill', which will replace the existing scale.

Is there a way to do this in R? Or is there another way (using other functions, e.g. smoothScatter) to get this or a similar result? As a workaround I think I can get a similar result using ImageMagick on the server, but I'd prefer to do it all in R.

Update 1

The multiplication of two layers is done in ImageMagick this way:

composite -compose multiply data-red.png data-green.png im-multiply.png

This gives the same result as shown above.
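
For reference, here is a minimal sketch of what the "multiply" blend mode does to a single pair of pixels, written in base R. The function name `multiply_blend` and the two example colours are hypothetical, chosen only for illustration. Each channel is scaled to 0..1 and the two values are multiplied, so the result cannot depend on which layer is named first.

```r
# Multiply blend of two colours, channel by channel, on a 0..1 scale.
# `multiply_blend` is an illustrative helper, not part of any package.
multiply_blend <- function(col1, col2) {
  a <- col2rgb(col1) / 255   # 3 x n matrix: rows are red, green, blue
  b <- col2rgb(col2) / 255
  m <- a * b                 # elementwise product of the channels
  rgb(m[1, ], m[2, ], m[3, ])
}

light_red   <- "#FF8080"     # example colours, assumed for this sketch
light_green <- "#80FF80"

multiply_blend(light_red, light_green)   # "#808040"
multiply_blend(light_green, light_red)   # "#808040", same result
```

Because multiplication is commutative, swapping the two arguments gives the same colour, which is the symmetry the ImageMagick command above relies on.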

Update 2

@Roland taught me in his answer how to plot the two datasets within the same plot. While this is great, one problem remains: The image depends on the order you feed the data to the plot.

ggplot(rbind(data.frame(data, group="a"), data.frame(data2, group="b")), aes(x=x,y=y)) + 
  stat_density2d(geom="tile", aes(fill = group, alpha=..density..), contour=FALSE) + 
  scale_fill_manual(values=c("a"="#FF0000", "b"="#00FF00")) + 
  geom_point() + 
  theme_minimal() + 
  xlim(-3.3, 3.3) + ylim(-3.3, 3.3) +
  coord_cartesian(xlim = c(-3.2, 3.2), ylim = c(-3.2, 3.2))

gives this result:

First plot dataset "a" then dataset "b".

When swapping the order of the datasets (data2 now comes first, then data), you get a similar result, but now the red color dominates, because it gets plotted later and thus partly overwrites the green data.

ggplot(rbind(data.frame(data2, group="a"), data.frame(data, group="b")), aes(x=x,y=y)) + 
  stat_density2d(geom="tile", aes(fill = group, alpha=..density..), contour=FALSE) + 
  scale_fill_manual(values=c("b"="#FF0000", "a"="#00FF00")) +
  geom_point() + theme_minimal() + 
  xlim(-3.3, 3.3) + ylim(-3.3, 3.3) + 
  coord_cartesian(xlim = c(-3.2, 3.2), ylim = c(-3.2, 3.2))


I need a solution that doesn't depend on the order of the datasets.
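
The order dependence can be reproduced with plain arithmetic. The sketch below assumes that semi-transparent layers are combined with the standard "over" operator (top layer weighted by its alpha, the rest taken from what is underneath) and that the background is white. The helper `over` and the alpha value 0.5 are assumptions made for illustration.

```r
# Standard "over" compositing of a semi-transparent colour onto a backdrop.
# Colours are numeric vectors c(red, green, blue) on a 0..1 scale.
over <- function(top, top_alpha, bottom) {
  top_alpha * top + (1 - top_alpha) * bottom
}

white <- c(1, 1, 1)
red   <- c(1, 0, 0)
green <- c(0, 1, 0)
alpha <- 0.5   # assumed opacity for both layers

# Stacking layers: the layer drawn last contributes most to the result.
red_then_green <- over(green, alpha, over(red, alpha, white))   # 0.50 0.75 0.25
green_then_red <- over(red, alpha, over(green, alpha, white))   # 0.75 0.50 0.25

# Multiply blend: render each layer on white separately, then multiply.
red_layer   <- over(red, alpha, white)     # 1.0 0.5 0.5
green_layer <- over(green, alpha, white)   # 0.5 1.0 0.5
multiplied  <- red_layer * green_layer     # 0.50 0.50 0.25 in either order
```

With stacking, the second layer always wins; with multiplication, both layers contribute equally, which is why the image editor result is symmetric.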

Claus Wilke
z80crew
  • Unfortunately: ["Currently there can be only one scale per plot (for everything except x and y)."](http://stackoverflow.com/a/3809071/1457051). And, since that's from Hadley, you are most likely truly stuck with generating two plots and converting with ImageMagick. – hrbrmstr Jun 06 '14 at 11:03
  • Thank you for pointing me to this statement. Then I'll go the ImageMagick way. – z80crew Jun 06 '14 at 11:05
  • Re the order of the stacking affecting the final colour, there is also this thread on this: http://www.mail-archive.com/r-help@r-project.org/msg84014.html Best solution would be to calculate the colours of the stack yourself and plot those - here is an example, though not in ggplot2: http://stackoverflow.com/questions/13867782/superimpose-red-green-images-in-r-using-image-or-rasterimage – Tom Wenseleers Aug 09 '15 at 13:44
  • The different result you get depending on the order may also have something to do with the white of the background being mixed in. – Tom Wenseleers Aug 09 '15 at 13:46
  • You could also try to calculate the densities yourself using MASS::kde2d, calculate the appropriate rgb values when stacked and plot those using something like qplot(x, y, data=mydata, fill=rgb, geom="raster") + scale_fill_identity() – Tom Wenseleers Aug 11 '15 at 09:04
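
The approach from the last comment could look roughly like the sketch below. It is not a tested solution for real data: the grid size `n`, the scaling of each density by its own maximum, and the linear white-to-red and white-to-green ramps are all assumptions made for illustration. The densities are computed with `MASS::kde2d`, turned into colours, multiplied channel by channel, and drawn with `geom_raster` and `scale_fill_identity`.

```r
library(MASS)  # for kde2d()

data  <- read.table(text = "P1 -1 0 4\nP2 0 0 2\nP3 2 1 8\nP4 -2 -2 6\nP5 0.5 2 12",
                    col.names = c("name", "x", "y", "score"))
data2 <- read.table(text = "Q1 1 1 3\nQ2 1 -1 2\nQ3 -1 1 8",
                    col.names = c("name", "x", "y", "score"))

# Evaluate both densities on the same grid (limits as in the question).
lims <- c(-3, 3, -3, 3)
n    <- 100   # assumed grid resolution
d1 <- kde2d(data$x,  data$y,  n = n, lims = lims)
d2 <- kde2d(data2$x, data2$y, n = n, lims = lims)

# Scale each density to 0..1 (assumption: each dataset by its own maximum).
a1 <- d1$z / max(d1$z)
a2 <- d2$z / max(d2$z)

# Layer 1 runs from white (1, 1, 1) to red (1, 0, 0):   (1, 1 - a1, 1 - a1)
# Layer 2 runs from white (1, 1, 1) to green (0, 1, 0): (1 - a2, 1, 1 - a2)
# Multiply the two layers channel by channel.
r <- 1 - a2
g <- 1 - a1
b <- (1 - a1) * (1 - a2)

# expand.grid varies x fastest, which matches as.vector() on the z matrices.
grid <- expand.grid(x = d1$x, y = d1$y)
grid$fill <- rgb(as.vector(r), as.vector(g), as.vector(b))

# Plot the precomputed colours; this step needs ggplot2.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(grid, aes(x = x, y = y)) +
    geom_raster(aes(fill = fill)) +
    scale_fill_identity() +
    geom_point(data = rbind(data, data2)) +
    theme_minimal()
  print(p)
}
```

Since the colour of every cell is a product of the two layers, swapping `data` and `data2` (together with their colours) gives the same image. Only one fill scale is involved, so the "Scale for 'fill' is already present" warning does not arise.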

2 Answers

16

This is the same solution as @Roland's, except that I suggest contour lines. These let you see the overlap. I can't see how geom_tile and your idea of "multiplication" would let you see that. Maybe it would work if you used blue and red for the non-overlapping areas and a "weighted" violet color for the overlapping area, but I guess you would have to compute that in a separate step before plotting.

contour_line

ggplot(rbind(data.frame(data, group="a"), data.frame(data2, group="b")), 
       aes(x=x,y=y)) + 
  stat_density2d(geom="density2d", aes(color = group,alpha=..level..),
                 size=2,
                 contour=TRUE) + 
  #scale_color_manual(values=c("a"="#FF0000", "b"="#00FF00")) +
  geom_point() +
  theme_minimal() +
  xlim(-3.3, 3.3) + ylim(-3.3, 3.3) +
  coord_cartesian(xlim = c(-3.2, 3.2), ylim = c(-3.2, 3.2))
Pierre
  • I guess you meant `scale_color_manual` instead of `scale_fill_manual` – William Zhang Jun 06 '14 at 12:56
  • @WilliamZhang Yes, thank you. Actually, I copy/pasted Roland's solution and made minimal changes. I'll correct my post right now. – Pierre Jun 06 '14 at 13:52
  • Thank you, this is an interesting visual approach. While it works well for the test data I've shown, I'm not sure this will do for my real data. Nevertheless +1. – z80crew Jun 06 '14 at 13:57
7

You should plot both densities on the same scale:

ggplot(rbind(data.frame(data, group="a"), data.frame(data2, group="b")), 
       aes(x=x,y=y)) + 
  stat_density2d(geom="tile", aes(fill = group, alpha=..density..), 
                 contour=FALSE) + 
  scale_fill_manual(values=c("a"="#FF0000", "b"="#00FF00")) +
  geom_point() +
  theme_minimal() +
  xlim(-3.3, 3.3) + ylim(-3.3, 3.3) +
  coord_cartesian(xlim = c(-3.2, 3.2), ylim = c(-3.2, 3.2))


Otherwise you display a distorted picture of your data.

Roland
  • This is an enormous improvement on my attempts. Thank you. Is there any chance that the two layers can be multiplied? With your code, the latter dataset is visually dominant, hence the green color is brighter than the red one. If I exchange *data* and *data2*, the red color gets brighter. – z80crew Jun 06 '14 at 11:21
  • The local density maxima of group a are higher than those of group b. The plot does reflect that. I have no idea what you mean by "multiplied". – Roland Jun 06 '14 at 11:28
  • But when the maxima of group a (plotted in red) are higher, why is green (group b) brighter? And if you change *data* and *data2* in your code to `rbind(data.frame(data2, group="a"), data.frame(data, group="b"))` and `values=c("b"="#FF0000", "a"="#00FF00")`, then red is brighter than green. By "multiplied" I refer to the blend mode for two layers in image editing. This mode is symmetric, so the order of execution doesn't matter. – z80crew Jun 06 '14 at 11:39
  • Sorry, I mixed that up during writing. The maxima of group b are higher. – Roland Jun 06 '14 at 11:55
  • Why should the maxima of group b be higher? And the fact remains that your solution depends on the order in which you feed the data to `ggplot`. The data that is plotted later partly overwrites the earlier data. – z80crew Jun 06 '14 at 13:33