31

I have a list of locations that contains a city, state, zip, latitude and longitude for each location.

I separately have a list of economic indicators at the county level. I've played with the zipcode package, the ggmap package, and several other free geocoding websites including the US Gazeteer files, but can't seem to find a way to match the two pieces.

Are there currently any packages or other sources that do this?

screechOwl
  • 27,310
  • 61
  • 158
  • 267
  • Is this any help? http://gis.stackexchange.com/questions/28035/how-to-convert-zip-codes-to-latitude-and-longitude – thelatemail Nov 09 '12 at 21:30
  • 2
    For the lat/long data, you could likely adapt the code I supplied [here](http://stackoverflow.com/questions/8751497/latitude-longitude-coordinates-to-state-code-in-r/8751965#8751965), in response to a question about how to convert lat/long coordinates to state codes. – Josh O'Brien Nov 09 '12 at 22:15
  • @JoshO'Brien: it looks like it may work. It looks like they have a map of all counties in the US. I'll have to code it up and see. Thanks for the suggestion. – screechOwl Nov 09 '12 at 23:13
  • 1
    @JoshO'Brien: nice idea. I just changed 'state' -> 'county' and it worked perfectly. Thank you very much. – screechOwl Nov 09 '12 at 23:18
  • @JoshO'Brien: you want to throw that in an answer and I'll choose it? – screechOwl Nov 20 '12 at 02:50
  • 1
    Oh, that's very kind of you, but no thanks. I'm just happy to learn that the approach works as well with the **maps** county database as it does with the states. It would, though be *very* appropriate for you to post what you used and then accept it in a couple of days. I'd certainly upvote a real answer, and it would likely be helpful for future searchers. Cheers. – Josh O'Brien Nov 20 '12 at 03:14

4 Answers4

24

I ended up using the suggestion from JoshO'Brien mentioned above and found here.

I took his code and changed state to county as shown here:

library(sp)
library(maps)
library(maptools)

# The single argument to this function, pointsDF, is a data.frame in which:
#   - column 1 contains the longitude in degrees (negative in the US)
#   - column 2 contains the latitude in degrees

latlong2county <- function(pointsDF) {
    # Prepare SpatialPolygons object with one SpatialPolygon
    # per county
    counties <- map('county', fill=TRUE, col="transparent", plot=FALSE)
    IDs <- sapply(strsplit(counties$names, ":"), function(x) x[1])
    counties_sp <- map2SpatialPolygons(counties, IDs=IDs,
                     proj4string=CRS("+proj=longlat +datum=WGS84"))

    # Convert pointsDF to a SpatialPoints object 
    pointsSP <- SpatialPoints(pointsDF, 
                    proj4string=CRS("+proj=longlat +datum=WGS84"))

    # Use 'over' to get _indices_ of the Polygons object containing each point 
    indices <- over(pointsSP, counties_sp)

    # Return the county names of the Polygons object containing each point
    countyNames <- sapply(counties_sp@polygons, function(x) x@ID)
    countyNames[indices]
}

# Test the function using points in Wisconsin and Oregon.
testPoints <- data.frame(x = c(-90, -120), y = c(44, 44))

latlong2county(testPoints)
[1] "wisconsin,juneau" "oregon,crook" # IT WORKS
Neil L
  • 3
  • 2
screechOwl
  • 27,310
  • 61
  • 158
  • 267
  • 2
    Hi, how can I modify this piece of code to convert lat-long to FIPS codes? Thank you. – Geekuna Matata Apr 14 '14 at 20:08
  • 1
    This code gave me an 'unknown elliptical parameter' error but worked for me when removing the proj4string in SpatialPoints and map2SpatialPolygons. Maybe an update to the package? Thank you for writing this! – bpheazye Sep 19 '18 at 13:39
  • Any idea why some places aren't covered. e.g. Calexico CA: testPoints <- data.frame(y =32.67138 , x = -115.5031) latlong2county(testPoints) – MatthewR Jul 14 '22 at 18:50
10

Matching Zipcodes to Counties is difficult. (Certain zip codes span more than one county and sometimes more than one state. For example 30165)

I am not aware of any specific R package that can match these up for you.

However, you can get a nice table from the Missouri Census Data Center.
You can use this page for data extraction.

A sample output might look like:

state,zcta5,ZIPName,County,County2
01,30165,"Rome, GA",Cherokee AL,
01,31905,"Fort Benning, GA",Russell AL,
01,35004,"Moody, AL",St. Clair AL,
01,35005,"Adamsville, AL",Jefferson AL,
01,35006,"Adger, AL",Jefferson AL,Walker AL
...

Note the County2. metadata explanation can be found here.

county 
The county in which the ZCTA is all or mostly contained. Over 90% of ZCTAs fall entirely within a single county.

county2 
The "secondary" county for the ZCTA, i.e. the county which has the 2nd largest intersection with it. Over 90% of the time this value will be blank.

See also ANSI County codes http://www.census.gov/geo/www/ansi/ansi.html

General Grievance
  • 4,555
  • 31
  • 31
  • 45
Ricardo Saporta
  • 54,400
  • 17
  • 144
  • 178
7

I think the package "noncensus" is helpful.

corresponding is what I use to match zipcode with county

### code for get county based on zipcode

library(noncensus)
data(zip_codes)
data(counties)

state_fips  = as.numeric(as.character(counties$state_fips))
county_fips = as.numeric(as.character(counties$county_fips))    
counties$fips = state_fips*1000+county_fips    
zip_codes$fips =  as.numeric(as.character(zip_codes$fips))

# test
temp = subset(zip_codes, zip == "30329")    
subset(counties, fips == temp$fips)
Fang Terry
  • 71
  • 1
  • 1
  • 2
    Your first threes lines, the construction of `counties$fips`, could be obtained in a single line with `counties$fips <- interaction(counties$state_fips, counties$county_fips)`. – lmo Jan 23 '17 at 21:08
  • 3
    Looks like noncensus was removed from CRAN 1/16/2020 – ChristinaP Apr 30 '20 at 20:19
4

A simple option is to use the geocode() function in ggmap, with the option output="more" or output="all.

This can take flexible input, such as the address or lat/lon, and returns Address, city, county, state, country, postal code, etc, as a list.

require("ggmap")
address <- geocode("Yankee Stadium", output="more")

str(address)
$ lon                        : num -73.9
$ lat                        : num 40.8
$ type                       : Factor w/ 1 level "stadium": 1
$ loctype                    : Factor w/ 1 level "approximate": 1
$ address                    : Factor w/ 1 level "yankee stadium, 1 east 161st street, bronx, ny 10451, usa": 1
$ north                      : num 40.8
$ south                      : num 40.8
$ east                       : num -73.9
$ west                       : num -73.9
$ postal_code                : chr "10451"
$ country                    : chr "united states"
$ administrative_area_level_2: chr "bronx"
$ administrative_area_level_1: chr "ny"
$ locality                   : chr "new york"
$ street                     : chr "east 161st street"
$ streetNo                   : num 1
$ point_of_interest          : chr "yankee stadium"
$ query                      : chr "Yankee Stadium"

Another solution is to use a census shapefile, and the same over() command from the question. I ran into a problem using the maptools base map: because it uses the WGS84 datum, in North America, points that were within a few miles of the coast were mapped incorrectly and about 5% of my data set did not match up.

try this, using the sp package and Census TIGERLine shape files

counties <- readShapeSpatial("maps/tl_2013_us_county.shp", proj4string=CRS("+proj=longlat +datum=NAD83"))

# Convert pointsDF to a SpatialPoints object 
pointsSP <- SpatialPoints(pointsDF, proj4string=CRS("+proj=longlat +datum=NAD83"))

countynames <- over(pointsSP, counties)
countynames <- countynames$NAMELSAD
ano
  • 704
  • 5
  • 15