41

This may be a wish-list item; I'm not sure (i.e., a geom_pie might need to be created for this to work). I saw a map today (LINK) with pie graphs on it, as seen here: [image]

I don't want to debate the merits of a pie graph; this is more an exercise in whether I can do this in ggplot.

I have provided a data set below (loaded from my Dropbox) that has the mapping data to make a New York State map, plus some purely fabricated data on racial percentages by county. The racial makeup is provided both merged into the main data set and as a separate data set called key. I also think Bryan Goodrich's response to me in another post (HERE) on centering county names will be helpful here.

How can we make the map above with ggplot2?

A data set and the map without the pie graphs:

load(url("http://dl.dropbox.com/u/61803503/nycounty.RData"))
head(ny); head(key)  #view the data set from my drop box
library(ggplot2)
ggplot(ny, aes(long, lat, group=group)) +  geom_polygon(colour='black', fill=NA)

#  Now how can we plot a pie chart of race on each county 
#  (sizing of the pie would also be controllable via a size 
#  parameter like other `geom_` functions).

Thanks in advance for your ideas.

EDIT: I just saw another case at junkcharts that screams for this type of capability: [image]

Tyler Rinker
  • Why ggplot2? You can do the map just as easily using base graphics (and maybe the sp package) and then stick the pie charts on top using floating.pie from the plotrix package – Spacedman Apr 28 '12 at 23:20
  • @Spacedman I am used to mapping in ggplot, I suppose. But a nice advantage of ggplot is access to `facet_grid`, which is handy for several choropleths at once. – Tyler Rinker Apr 28 '12 at 23:23
  • It would also be interesting to extend this to little bar graphs, histograms, stacked bars, etc. (this may already be doable) – Tyler Rinker Apr 28 '12 at 23:37
  • There's [a draft paper](http://vita.had.co.nz/papers/glyph-maps.pdf) by @hadley with somewhat similar ideas – baptiste Apr 29 '12 at 00:15
  • @baptiste This looks promising but may be a ways off. I thought about learning how to make `geoms` (I'm sure you can), but it would need to take data for the percents as well as data for the locations. Could be pretty interesting. – Tyler Rinker Apr 29 '12 at 01:04
  • In addition to Spacedman's advice, you can see [this related answer](http://stackoverflow.com/a/9233437/604456) by John Colby. IMO these types of maps are frequently better portrayed as a series of small multiple maps (see [this question](http://gis.stackexchange.com/q/4568/751) on the GIS site with a related discussion), but I can appreciate wanting to try to make them! Another similar option would be star charts (or radar charts). Their geometry would be less tedious to code up from scratch than pie charts. – Andy W Apr 29 '12 at 12:15
  • The fact that these types of plots are on junkcharts suggests you may wish to consider simpler ways to plot this data. Consider: `load(url("http://dl.dropbox.com/u/61803503/nycounty.RData")); library(ggplot2); library(reshape); ny <- melt(ny, id.vars=1:5); ggplot(ny, aes(long, lat, group=group)) + geom_polygon(colour='black', aes(fill=value)) + facet_wrap(~variable, ncol=2)` [Or see also](http://solomonmessing.wordpress.com/2012/03/04/visualization-series-insight-from-cleveland-and-tufte-on-plotting-numeric-data-by-groups/) – Solomon May 23 '12 at 18:57
  • I agree that there may be better ways, but that's not really the point of this post. I am not actually plotting this data; I'm looking for a way to do glyphing in ggplot. It's not always the best tool for the job, but sometimes it is. There are tons of glyph types, not just pies. Check out some of Tufte's or Wilkinson's work and you'll see glyphs. ggplot is about giving you the tools so you can represent your data in a way that makes sense. Wickham says that right in his book, paraphrasing: you can do it in ggplot but it may not make sense. This post was about **how to**, not **should you**. – Tyler Rinker May 23 '12 at 20:20
  • I wonder if the `ggsubplot` package could be used for this, as in this blog post: http://blog.revolutionanalytics.com/2012/09/visualize-complex-data-with-subplots.html – Tyler Rinker Dec 05 '12 at 03:13
  • Here is a related question where I used `ggsubplot` in the answer: http://stackoverflow.com/questions/16028659/plots-on-a-map-using-ggplot2/16054062#16054062 – Jonas Tundo Apr 17 '13 at 08:46

5 Answers

35

Three years later this is solved. I've combined a number of processes, and thanks to @Guangchuang Yu's excellent ggtree package this can be done fairly easily. Note that as of 9/3/2015 you need version 1.0.18 of ggtree installed, but these changes will eventually trickle down to the respective repositories.


I've used the following resources to make this (the links will give greater detail):

  1. ggtree blog
  2. move ggplot legend
  3. correct ggtree version
  4. centering things in polygons
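As a rough illustration of item 4 (centering things in polygons), here is a minimal base-R sketch of an area-weighted polygon centroid using the shoelace formula. It is not the method used in the code below, which relies on `sp::Polygon` and its `labpt` slot; `polygon_centroid` is a hypothetical helper name, and the sketch assumes a single simple (non-self-intersecting) ring, so for map data it would have to be applied per `group` rather than per county.

```r
# Area-weighted centroid of one simple polygon ring (shoelace formula).
# `polygon_centroid` is a hypothetical helper, not part of sp or ggplot2.
polygon_centroid <- function(x, y) {
  n <- length(x)
  stopifnot(n >= 3, length(y) == n)
  # Pair each vertex with the next one, wrapping around to close the ring
  j <- c(2:n, 1)
  cross <- x * y[j] - x[j] * y
  area <- sum(cross) / 2  # signed area; the sign cancels in the ratios below
  c(long = sum((x + x[j]) * cross) / (6 * area),
    lat  = sum((y + y[j]) * cross) / (6 * area))
}

# Example on a unit square (made-up coordinates, not the NY data)
sq <- polygon_centroid(c(0, 1, 1, 0), c(0, 0, 1, 1))
print(sq)
```

A degenerate ring with zero area would divide by zero here, so real data may need a guard for that case.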

Here's the code:

load(url("http://dl.dropbox.com/u/61803503/nycounty.RData"))
head(ny); head(key)  #view the data set from my drop box

if (!require("pacman")) install.packages("pacman")
p_load(ggplot2, ggtree, dplyr, tidyr, sp, maps, pipeR, grid, XML, gtable)

getLabelPoint <- function(county) {Polygon(county[c('long', 'lat')])@labpt}

df <- map_data('county', 'new york')                 # NY region county data
centroids <- by(df, df$subregion, getLabelPoint)     # Returns list
centroids <- do.call("rbind.data.frame", centroids)  # Convert to Data Frame
names(centroids) <- c('long', 'lat')                 # Appropriate Header

pops <-  "http://data.newsday.com/long-island/data/census/county-population-estimates-2012/" %>%
     readHTMLTable(which=1) %>%
     tbl_df() %>%
     select(1:2) %>%
     setNames(c("region", "population")) %>%
     mutate(
         population = {as.numeric(gsub("\\D", "", population))},
         region = tolower(gsub("\\s+[Cc]ounty|\\.", "", region)),
         #weight = ((1 - (1/(1 + exp(population/sum(population)))))/11) 
         weight = exp(population/sum(population)),
         weight = sqrt(weight/sum(weight))/3
     )


race_data_long <- add_rownames(centroids, "region") %>>%
    left_join({distinct(select(ny, region:other))}) %>>%
    left_join(pops) %>>%
    (~ race_data) %>>%
    gather(race, prop, white:other) %>%
    split(., .$region)

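# Build one pie per county: a stacked bar in polar coordinates, with axes
# and background stripped so only the pie is drawn
# (theme_tree() and theme_transparent() come from ggtree)
pies <- setNames(lapply(seq_along(race_data_long), function(i){
    ggplot(race_data_long[[i]], aes(x=1, prop, fill=race)) +
        geom_bar(stat="identity", width=1) + 
        coord_polar(theta="y") + 
        theme_tree() + 
        xlab(NULL) + 
        ylab(NULL) + 
        theme_transparent() +
        theme(plot.margin=unit(c(0,0,0,0),"mm"))
}), names(race_data_long))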


e1 <- ggplot(race_data_long[[1]], aes(x=1, prop, fill=race)) +
        geom_bar(stat="identity", width=1) + 
        coord_polar(theta="y") 

leg1 <- gtable_filter(ggplot_gtable(ggplot_build(e1)), "guide-box") 


p <- ggplot(ny, aes(long, lat, group=group)) +  
    geom_polygon(colour='black', fill=NA) +
    theme_bw() +
    annotation_custom(grob = leg1, xmin = -77.5, xmax = -78.5, ymin = 44, ymax = 45) 



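n <- length(pies)

# Place each pie on the map at its county's label point; the last two
# arguments to subview() (from ggtree) are the pie's width and height
for (i in seq_len(n)) {

    nms <- names(pies)[i]
    dat <- race_data[which(race_data$region == nms)[1], ]
    p <- subview(p, pies[[i]], x=unlist(dat[["long"]])[1], y=unlist(dat[["lat"]])[1], dat[["weight"]], dat[["weight"]])

}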

print(p)
Tyler Rinker
15

This functionality should be in ggplot, I think it is coming to ggplot soonish, but it is currently available in base plots. I thought I would post this just for comparison's sake.

load(url("http://dl.dropbox.com/u/61803503/nycounty.RData"))

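library(plotrix)

e <- 1e-5  # small offset so that no slice is exactly zero

# Draw one pie at the mean of a county's boundary coordinates
myglyff <- function(gi) {
  floating.pie(mean(gi$long),
               mean(gi$lat),
               x = c(gi[1, "white"] + e,
                     gi[1, "black"] + e,
                     gi[1, "hispanic"] + e,
                     gi[1, "asian"] + e,
                     gi[1, "other"] + e),
               radius = 0.1)  # insert size variable here
}

# Plot the first county outline to set up the plotting region
g1 <- ny[which(ny$group == 1), ]
plot(g1$long,
     g1$lat,
     type = 'l',
     xlim = c(-80, -71.5),
     ylim = c(40.5, 45.1))

myglyff(g1)

# Add the remaining outlines and pies (the 62 groups are hard-coded)
for (i in 2:62) {
  gi <- ny[which(ny$group == i), ]
  lines(gi$long, gi$lat)
  myglyff(gi)
}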

Also, there may be (probably are) more elegant ways of doing this in the base graphics.
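One possible alternative, sketched here under assumptions rather than taken from any package, is to compute the wedge outlines yourself and hand them to `polygon()` (or to `geom_polygon()` in ggplot2). `pie_wedges` is a hypothetical helper, and the centre and proportions in the example are made up.

```r
# Build polygon vertices for each wedge of a pie centred at (x0, y0).
# `pie_wedges` is a hypothetical helper, not a plotrix or ggplot2 function.
pie_wedges <- function(x0, y0, values, radius = 0.1, n = 50) {
  stopifnot(all(values >= 0), sum(values) > 0)
  # Cumulative angles (in radians) marking the wedge boundaries
  breaks <- c(0, cumsum(values) / sum(values)) * 2 * pi
  wedges <- lapply(seq_along(values), function(i) {
    theta <- seq(breaks[i], breaks[i + 1], length.out = n)
    # Each wedge is the centre point followed by n points along its arc
    data.frame(wedge = i,
               x = c(x0, x0 + radius * cos(theta)),
               y = c(y0, y0 + radius * sin(theta)))
  })
  do.call(rbind, wedges)
}

# Made-up proportions for a single pie
w <- pie_wedges(-75, 43, c(white = 0.6, black = 0.2, other = 0.2))

# Base graphics: for (d in split(w, w$wedge)) polygon(d$x, d$y, col = d$wedge[1] + 1)
# ggplot2: + geom_polygon(data = w, aes(x, y, group = wedge, fill = factor(wedge)),
#                         inherit.aes = FALSE)
```

With several pies, the grouping variable would need to be unique per pie and wedge, and the pies only look circular if the plot uses an equal aspect ratio.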

It's a New York Pie!!

As you can see, there are quite a few problems with this that still need to be solved. The counties have no fill color. The pie charts tend to be too small or to overlap. The lat and long are not projected, so the sizes of the counties are distorted.
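For the sizing problem, one common approach (an assumption on my part, not something the code above does) is to make the pie's area, rather than its radius, proportional to some county-level value, and to cap the largest radius so neighbouring pies are less likely to overlap. `pie_radius` and the example values are hypothetical.

```r
# Radius such that pie AREA is proportional to `value`,
# with the largest pie capped at `max_radius` (in map units).
# `pie_radius` is a hypothetical helper.
pie_radius <- function(value, max_radius = 0.3) {
  stopifnot(all(value >= 0), max(value) > 0)
  max_radius * sqrt(value / max(value))
}

# Made-up county values: each is 4x the previous, so each radius doubles
r <- pie_radius(c(100, 400, 1600))
print(r)
```

The result could be passed as the `radius` argument in `myglyff()` in place of the fixed `.1`.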

In any event, I am interested in what others can come up with.

Seth
  • "This functionality should be in ggplot, I think it is coming to ggplot soonish." -- Does anyone know whether this functionality is now available in ggplot2? All I can find is this SO page, nothing in docs.ggplot2 – Adrian Aug 29 '13 at 14:28
  • The paper cited above and maybe a slide presentation led me to say this. Not sure where it stands now. Not sure if Hadley Wickham comes around here anymore, but he would be the one to ask. @hadley – Seth Aug 29 '13 at 15:03
  • perhaps this is the geom used in hadley's paper: http://docs.ggplot2.org/current/geom_segment.html – marbel Jan 07 '14 at 19:57
6

I've written some code to do this using grid graphics. There is an example here: https://qdrsite.wordpress.com/2016/06/26/pies-on-a-map/

The goal here was to associate the pie charts with specific points on the map, and not necessarily regions. For this particular solution, it is necessary to convert the map coordinates (latitude and longitude) to a (0,1) scale so they can be plotted in the proper locations on the map. The grid package is used to print to the viewport that contains the plot panel.
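The coordinate conversion described above is a linear rescaling of each axis to the range 0 to 1. A minimal sketch, where `rescale01` is a hypothetical helper and the limits are the ones used for the map axes in the script below:

```r
# Linearly rescale a coordinate to the 0-1 range given the axis limits.
# `rescale01` is a hypothetical helper name.
rescale01 <- function(v, limits) {
  stopifnot(length(limits) == 2, limits[2] > limits[1])
  (v - limits[1]) / (limits[2] - limits[1])
}

lon_lim <- c(-125, -65)  # longitude limits of the map panel
lat_lim <- c(25, 50)     # latitude limits of the map panel

# A point in the middle of both ranges maps to the centre of the panel
px <- rescale01(-95, lon_lim)
py <- rescale01(37.5, lat_lim)
print(c(px, py))
```

Because the map is drawn with a projection (`coord_map()`), a linear rescaling of latitude is only an approximation of where a point actually lands in the panel.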

Code:

# Pies On A Map
# Demonstration script
# By QDR

# Uses NLCD land cover data for different sites in the National Ecological Observatory Network.
# Each site consists of a number of different plots, and each plot has its own land cover classification.
# On a US map, plot a pie chart at the location of each site with the proportion of plots at that site within each land cover class.

# For this demo script, I've hard coded in the color scale, and included the data as a CSV linked from dropbox.

# Custom color scale (taken from the official NLCD legend)
nlcdcolors <- structure(c("#7F7F7F", "#FFB3CC", "#00B200", "#00FFFF", "#006600", "#E5CC99", "#00B2B2", "#FFFF00", "#B2B200", "#80FFCC"), .Names = c("unknown", "cultivatedCrops", "deciduousForest", "emergentHerbaceousWetlands", "evergreenForest", "grasslandHerbaceous", "mixedForest", "pastureHay", "shrubScrub", "woodyWetlands"))

# NLCD data for the NEON plots
nlcdtable_long <- read.csv(file='https://www.dropbox.com/s/x95p4dvoegfspax/demo_nlcdneon.csv?raw=1', row.names=NULL, stringsAsFactors=FALSE)

library(ggplot2)
library(plyr)
library(grid)

# Create a blank state map. The geom_tile() is included because it allows a legend for all the pie charts to be printed, although it does not
statemap <- ggplot(nlcdtable_long, aes(decimalLongitude,decimalLatitude,fill=nlcdClass)) +
geom_tile() +
borders('state', fill='beige') + coord_map() +
scale_x_continuous(limits=c(-125,-65), expand=c(0,0), name = 'Longitude') +
scale_y_continuous(limits=c(25, 50), expand=c(0,0), name = 'Latitude') +
scale_fill_manual(values = nlcdcolors, name = 'NLCD Classification')

# Create a list of ggplot objects. Each one is the pie chart for each site with all labels removed.
pies <- dlply(nlcdtable_long, .(siteID), function(z)
ggplot(z, aes(x=factor(1), y=prop_plots, fill=nlcdClass)) +
geom_bar(stat='identity', width=1) +
coord_polar(theta='y') +
scale_fill_manual(values = nlcdcolors) +
theme(axis.line=element_blank(),
axis.text.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
legend.position="none",
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),
plot.background=element_blank()))

# Use the latitude and longitude maxima and minima from the map to calculate the coordinates of each site location on a scale of 0 to 1, within the map panel.
piecoords <- ddply(nlcdtable_long, .(siteID), function(x) with(x, data.frame(
siteID = siteID[1],
x = (decimalLongitude[1]+125)/60,
y = (decimalLatitude[1]-25)/25
)))

# Print the state map.
statemap

# Use a function from the grid package to move into the viewport that contains the plot panel, so that we can plot the individual pies in their correct locations on the map.
downViewport('panel.3-4-3-4')

# Here is the fun part: loop through the pies list. At each iteration, print the ggplot object at the correct location on the viewport. The y coordinate is shifted by half the height of the pie (set at 12% of the height of the map) so that the pie will be centered at the correct coordinate.
for (i in 1:length(pies)) 
  print(pies[[i]], vp=dataViewport(xData=c(-125,-65), yData=c(25,50), clip='off',xscale = c(-125,-65), yscale=c(25,50), x=piecoords$x[i], y=piecoords$y[i]-.06, height=.12, width=.12))

The result looks like this:

map with pies

qdread
2

I stumbled upon what looks like a function to do this: "add.pie" in the "mapplots" package.

The example from the package is below.

library(mapplots)

# Empty plot, then two pies with random slice sizes
plot(NA,NA, xlim=c(-1,1), ylim=c(-1,1) )
add.pie(z=rpois(6,10), x=-0.5, y=0.5, radius=0.5)
add.pie(z=rpois(4,10), x=0.5, y=-0.5, radius=0.3)
Conrad
  • I have tried several options from the answers here and this package was the one that I found easiest to use to quickly reach the point of a map and a pie chart or two on the map! – Amorphia Feb 22 '23 at 17:13
1

A slight variation on the OP's original requirements, but this seems like an appropriate answer/update.

If you want an interactive Google Map, as of googleway v2.6.0 you can add charts inside info_windows of map layers.

See ?googleway::google_charts for documentation and examples.

library(googleway)

set_key("GOOGLE_MAP_KEY")

## create some dummy chart data
markerCharts <- data.frame(stop_id = rep(tram_stops$stop_id, each = 3))
markerCharts$variable <- c("yes", "no", "maybe")
markerCharts$value <- sample(1:10, size = nrow(markerCharts), replace = TRUE)

chartList <- list(
  data = markerCharts
  , type = 'pie'
  , options = list(
    title = "my pie"
    , is3D = TRUE
    , height = 240
    , width = 240
    , colors = c('#440154', '#21908C', '#FDE725')
    )
  )

google_map() %>%
  add_markers(
    data = tram_stops
    , id = "stop_id"
    , info_window = chartList
  )


SymbolixAU