
I'm trying to build a map in ggplot2 using data from separate data frames.

library(maptools)
library(ggplot2)  # provides fortify() and ggplot()

xx <- readShapePoly(system.file("shapes/sids.shp", package="maptools")[1], IDvar="FIPSNO", proj4string=CRS("+proj=longlat +ellps=clrk66"))

xx.sub1 <- subset(xx, xx$FIPSNO < 37010)
xx.sub2 <- subset(xx, xx$FIPSNO > 37010)

xx.sub1@data$id <- rownames(xx.sub1@data)
xx.sub1.points <- fortify(xx.sub1, region="id")
xx.sub1.df = plyr::join(xx.sub1.points, xx.sub1@data, by="id")

xx.sub2@data$id <- rownames(xx.sub2@data)
xx.sub2.points <- fortify(xx.sub2, region="id")
xx.sub2.df = plyr::join(xx.sub2.points, xx.sub2@data, by="id")

ggplot(xx.sub2.df) + 
  aes(long, lat, fill = (SID79/BIR79)*1000, group = group) + 
  geom_polygon() + geom_path(color="grey80") +
  coord_equal() + 
  scale_fill_gradientn(colours = RColorBrewer::brewer.pal(7, "YlOrBr")) +
  geom_polygon(data = xx.sub1.df, fill = "grey50") + 
  geom_path(data = xx.sub1.df, color="grey80") +
  labs(fill = "Mapped value", title = "Title")

Up to this point everything works as expected and I get a nice map.

What I'd like to do, however, is add a separate legend for the data from xx.sub1.df. Since all of its polygons are filled with the same grey, I expect it to be a single additional entry.

How can I achieve that?

Matifou
radek
  • Reproducible example ( http://tinyurl.com/reproducible-000 ) please? The canonical way to solve this problem is to merge the data sets, including a factor variable identifying which original data frame each data set came from, then use an aesthetic (in your case for fill, I think) ... you might look at the `scales` package to see if there's another way – Ben Bolker May 05 '13 at 23:02
  • @BenBolker Roger that. Example added. I'm aware that it would be much easier to have everything in one df. However, I often work with different layers of data (might be my bias from coming from a GIS background) that would be a pain in the neck to join. And in this particular example I need to select a few polygons and 'highlight' or 'mask' them in a quick way. – radek May 07 '13 at 07:58
  • Could you please add a `dput` of your data, so that one can answer your question with an updated heatmap? I'm guessing that (1) you use only 2 columns of `xx.sub2`, and (2) states appear grey if they are present in `xx.sub1`. Hence joining doesn't seem that annoying. You could simply add a factor in `xx.sub2` for entries that are in `xx.sub1`, and perhaps use `scale_fill_manual` to adjust colours in the legend. – G Chalancon May 12 '13 at 08:18
  • @GChalancon I'm using example data from the `maptools` package, which I hope makes the example reproducible (I believe `dput` is no longer needed?). As for points 1 and 2 - both are valid options for the toy example. For real scenarios, however, the datasets are more complex and your solution would be harder to implement. Hence, I'd love to be able to achieve that without the joins, working with 'independent' data frames. – radek May 13 '13 at 11:02
  • Oh, I didn't try to load the data before, my mistake. The major difficulty is that you want to map two distinct scales that use the same aesthetic (`fill`, with `geom_polygon` for both data frames). As far as I know that's not possible in ggplot2, but here is a suggestion: how about using a different annotation (e.g. a `geom_text`) to mark the 'grey' regions? It would give you 2 legends, but I understand it might not be that satisfactory. One thing: is it possible that in a real case regions of `xx.sub1.df` would overlap with `xx.sub2.df`? – G Chalancon May 13 '13 at 15:32
  • @GChalancon Thanks for the update. As long as the legend includes the greyed polygons I'm completely open to any method. `geom_text` might actually be a good idea, so if you manage to hack something together using this method - let me know! – radek May 13 '13 at 17:09
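
The merge approach suggested in the comments above could be sketched roughly as follows. This is only a minimal illustration using base R and made-up toy data: `sub1.df`, `sub2.df`, and the `rate` and `source` columns are hypothetical stand-ins, not the fortified data frames from the question.

```r
# Toy stand-ins for the two fortified data frames (values are made up).
# Each one describes a single square polygon.
sub1.df <- data.frame(long = c(0, 1, 1, 0), lat = c(0, 0, 1, 1),
                      group = "a.1", rate = NA_real_)  # polygons to be greyed out
sub2.df <- data.frame(long = c(1, 2, 2, 1), lat = c(0, 0, 1, 1),
                      group = "b.1", rate = 2.5)       # polygons with a mapped value

# Tag every row with the data frame it came from, then stack the two.
sub1.df$source <- "masked"
sub2.df$source <- "mapped"
combined <- rbind(sub1.df, sub2.df)

# Store the origin as a factor so it can later be mapped to an aesthetic.
combined$source <- factor(combined$source, levels = c("mapped", "masked"))
```

With a single data frame like this, `source` can be mapped to an aesthetic so that ggplot2 generates a legend entry for it, at the cost of the join the asker wanted to avoid.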

1 Answer


I'm not 100% sure this is what you want, but here's how I'd approach the problem as I understand it. If we map some otherwise unused geom to any data from xx.sub1.df, but make it invisible on the plot, we still get a legend for that geom. Here I've used geom_point, but other geoms would work too.

p <- ggplot(xx.sub2.df) + 
  aes(long, lat, fill = (SID79/BIR79)*1000, group = group) + 
  geom_polygon() + geom_path(color="grey80") +
  coord_equal() + 
  scale_fill_gradientn(colours = RColorBrewer::brewer.pal(7, "YlOrBr")) +
  geom_polygon(data = xx.sub1.df, fill = "grey50") + 
  geom_path(data = xx.sub1.df, color="grey80") +
  labs(fill = "Mapped value", title = "Title")

#Now we add geom_point() setting shape as NA, but the colour as "grey50", so the 
#legend will be displaying the right colour

p2 <- p + geom_point(data = xx.sub1.df, aes(size="xx.sub1", shape = NA), colour = "grey50")


Now we just need to alter the size and shape of the point on the legend, and change the name of the legend (thanks to @DizisElferts who demonstrated this earlier).

p2 + guides(size=guide_legend("Source", override.aes=list(shape=15, size = 10)))


Of course, you can change the labels or anything else to highlight what you want to show.

If this isn't what you're after, let me know!

alexwhan
  • That's nice! It seemed to me that there was no other workaround than adding a new aesthetic (here `size`, via `geom_point`) for which a legend can be added. I didn't know about `shape=NA`, so that's really useful. – G Chalancon May 14 '13 at 14:25