There are several R packages that make working with US Census data easier. The two I use most frequently are tigris (for loading the spatial data) and acs (for loading the tabular data).
However, one problem I keep running into is that I can't figure out an efficient, reliable way to determine all of the tracts (or block groups, zip codes, etc.) within a Place without leaving the R console.
For instance, if I wanted to work with census tract data in Seattle, I would begin by using tigris::tracts to download the spatial data for King County, WA:
library(tigris)
tr <- tigris::tracts(state = "WA", county = "King")
But unfortunately there's no obvious way to subset this data to include only the tracts in Seattle.
dplyr::glimpse(tr)
Observations: 398
Variables: 12
$ STATEFP (chr) "53", "53", "53", "53", "53", "53", "53", ...
$ COUNTYFP (chr) "033", "033", "033", "033", "033", "033", ...
$ TRACTCE (chr) "003800", "021500", "032704", "026200", "0...
$ GEOID (chr) "53033003800", "53033021500", "53033032704...
$ NAME (chr) "38", "215", "327.04", "262", "327.03", "3...
$ NAMELSAD (chr) "Census Tract 38", "Census Tract 215", "Ce...
$ MTFCC (chr) "G5020", "G5020", "G5020", "G5020", "G5020...
$ FUNCSTAT (chr) "S", "S", "S", "S", "S", "S", "S", "S", "S...
$ ALAND (dbl) 624606, 3485578, 17160645, 15242622, 10319...
$ AWATER (dbl) 0, 412526, 447367, 526886, 175464, 0, 4360...
$ INTPTLAT (chr) "+47.6794093", "+47.7643848", "+47.4940877...
$ INTPTLON (chr) "-122.2955292", "-122.2737863", "-121.7717...
Similarly, the acs package allows users to create subsets of census data using the geo.make function, but in my example this won't help if I don't already have the list of tract GEOIDs for all of the Seattle tracts.
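To make the gap concrete: if I already had a vector of Seattle tract GEOIDs, the subsetting itself would be trivial. The sketch below uses a small stand-in data frame built from the first three rows of the output above. Note that `tr_df` and `seattle_geoids` are hypothetical names, and the single GEOID in `seattle_geoids` is only a placeholder, not a verified list of Seattle tracts.

```r
# Stand-in for the attribute table of `tr`
# (GEOID and NAME values copied from the first three rows of the output above)
tr_df <- data.frame(
  GEOID = c("53033003800", "53033021500", "53033032704"),
  NAME  = c("38", "215", "327.04"),
  stringsAsFactors = FALSE
)

# Hypothetical placeholder: this is the vector I don't know how to
# obtain from within R. The value is illustrative only.
seattle_geoids <- c("53033003800")

# Keep only the rows whose GEOID appears in the vector
seattle_tr <- tr_df[tr_df$GEOID %in% seattle_geoids, ]
```

I would expect the same `%in%` filter to work on the object returned by tigris::tracts, since it exposes a GEOID column, so the only missing piece is a reliable source for `seattle_geoids`.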
For the record, I am aware that it is possible to determine this information elsewhere. This page in the Census.gov FAQ gives clear instructions on how to determine all the tracts in a given census Place. But given that this is a crucial step in many census-related analyses, it would be best if there were a convenient way to do it from the R console.
Thanks in advance.
Edit
Although this question deals with spatial data, I am most interested in finding a non-spatial solution. For instance, I would prefer a solution that queries the Census API and returns a vector of the desired GEOIDs over one that employs a spatial analysis tool (e.g., rgeos::gIntersects) to create the vector. Why? Because spatial approaches are more prone to error in this process, and this is known information we're talking about, not something that needs to be inferred spatially.