I have a shapefile of school districts in Texas and am trying to use ggplot2 to highlight 10 in particular. I've tinkered with it and gotten everything set up, but when I spot-checked the result I realized that the 10 highlighted districts are not the ones I want.
The shapefile can be downloaded from this link to the Texas Education Agency Public Open Data Site.
#install.packages(c("ggplot2", "rgdal"))
library(ggplot2)
library(rgdal)
#rm(list=ls())
#setwd("path")
# read shapefile
tex <- readOGR(dsn = paste0(getwd(), "/Current_Districts/Current_Districts.shp"))
# colors to use and districts to highlight
cols <- c("#CCCCCC", "#003082")
districts <- c("Aldine", "Laredo", "Spring Branch", "United", "Donna", "Brownsville", "Houston", "Bryan", "Galena Park", "San Felipe-Del Rio Cons")
# extract from shapefile data just the name and ID, then subset to only the districts of interest
dist_info <- data.frame(cbind(as.character(tex@data$NAME2), as.character(tex@data$FID)), stringsAsFactors=FALSE)
names(dist_info) <- c("name", "id")
dist_info <- dist_info[dist_info$name %in% districts, ]
# turn shapefile into df
tex_df <- fortify(tex)
# create dummy fill var for if the district is one to be highlighted
tex_df$yes <- as.factor(ifelse(tex_df$id %in% dist_info$id, 1, 0))
# plot the graph
ggplot(data=tex_df) +
geom_polygon(aes(x=long, y=lat, group=group, fill=yes), color="#CCCCCC") +
scale_fill_manual(values=cols) +
theme_void() +
theme(legend.position = "none")
As you'll see, the resulting plot looks like it does exactly what I want. The problem is that the ten highlighted districts are not the ones in the districts vector above. I've rerun everything from a clean session numerous times, double-checked that I wasn't hitting a factor/character conversion issue, and confirmed in the web data explorer that the IDs I get from the shapefile are indeed the ones that match my list of names. I really have no idea where this issue could be coming from.
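One further check that might narrow this down is sketched below. It assumes (I have not confirmed this) that the id column produced by fortify() comes from the polygon IDs stored inside the spatial object rather than from the FID attribute column. It reuses tex and tex_df from the code above and lists which district names the highlighted ids would point to under that assumption.

```r
# Diagnostic sketch, not a fix. Assumes `tex` and `tex_df` exist as created above.

# Polygon IDs stored in the spatial object itself (one per district polygon set)
poly_ids <- sapply(tex@polygons, function(p) p@ID)

# Put the candidate identifiers side by side to see whether they agree
id_check <- data.frame(
  poly_id  = poly_ids,
  row_name = rownames(tex@data),
  FID      = as.character(tex@data$FID),
  name     = as.character(tex@data$NAME2),
  stringsAsFactors = FALSE
)
head(id_check)

# Ids that were actually flagged for highlighting in the plot data
highlighted <- unique(tex_df$id[tex_df$yes == 1])

# District names those ids correspond to, IF id refers to the polygon ID
id_check$name[match(highlighted, id_check$poly_id)]
```

If poly_id and FID differ in id_check (for example, one starting at 0 and the other at 1), that would be consistent with the wrong districts being highlighted.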
This is my first time working with shapefiles and rgdal, so if I had to guess, there's something simple about the structure that I don't understand, and hopefully one of you can quickly point it out for me. Thanks!
Here's the output: