3

I have a very simple operation I would like to perform: combining two shape files. Specifically, I have census tract shape files for each state in the US that I would like to combine into one shape file. Ultimately, I want to take the combined shape file and perform an overlay on a set of latitude and longitude coordinates, to determine which census tracts my coordinates fall into.

I have seen much discussion on this (Combining bordering shapefiles in R). However, all of the discussion are outdated and I am hoping that the packages have been improved in the meantime.

The files I am using are located here: ftp://ftp2.census.gov/geo/tiger/TIGER2010/TRACT/2010/

The code below can be used to recreate the files I am working with. It requires downloading two files, approximately 11 megabytes total. The run time should be only a minute.

Any help is greatly appreciated. This seems like a trivial operation to perform. Perhaps if I had more experience with geo-spatial mapping I could make better use of the documentation available.

Here are a few things I have tried:

### Insert your file path here
FPATH <- './data'

### Set up library
require(rgeos)
require(maptools)
require(RCurl)
require(parallel)
cl <- makeCluster(detectCores())

### Download files... (~11,000 KB total for this example)
ftp <- 'ftp://ftp2.census.gov/geo/tiger/TIGER2010/TRACT/2010/'
files <- getURLContent(ftp, dirlistonly = T)
files <- unlist(strsplit(files, split = '\r\n', fixed = T))
files <- grep('2010_[[:digit:]]{2}_', files, value = T)[1:2]  # Only grab two files for this example
clusterMap(cl, download.file, url = paste0(ftp, files), destfile = paste0(FPATH, files))

### Unzip shape files
files <- list.files(FPATH, full.names = T)
clusterMap(cl, unzip, zipfile = files, exdir = FPATH)

### Read in shape files
files <- list.files(FPATH, pattern = "shp$", full.names = T)
a <- readShapePoly(fn = files[1])
b <- readShapePoly(fn = files[2])

### Attempt to join two shape files
spRbind(a, b)  # Error in spRbind(as(obj, "SpatialPolygons"), as(x, "SpatialPolygons")) : non-unique polygon IDs
gUnion(a, b)   # Error in RGEOSBinTopoFunc(spgeom1, spgeom2, byid, id, "rgeos_union")   : geos_geospolygon2SpatialPolygons: ng > length(IDs)

Thank you for your time.

Community
  • 1
  • 1
Andreas
  • 1,923
  • 19
  • 24

1 Answers1

5

You've got duplicated polygon IDs. You can change them manually, or use taRifx.geo:

library(devtools)
install_git("git://github.com/gsk3/taRifx.geo.git")
library(taRifx.geo)
rbind(a,b, fix.duplicated.IDs=TRUE)

Code available for inspection here.

Ari B. Friedman
  • 71,271
  • 35
  • 175
  • 235
  • Ari, thank you so much for your quick response, as well as for developing the package; you're awesome! I also appreciate the link to the source code. This function is exactly what I am looking for. I had considered changing polygon ID's manually but wasn't sure if this would be okay. – Andreas Nov 13 '13 at 19:37
  • My pleasure. Please let me know if you find any bugs--I wrote that one in a hurry! – Ari B. Friedman Nov 13 '13 at 21:28
  • Not a bug, but the function throws an error when I use it with `do.call(rbind, obj)`. Looking at `do.call` documentation I see that some functions which use `substitute` may be evaluated in different ways which are "currently undefined and subject to change". What is the cleanest way to feed a list of inputs to your function? – Andreas Nov 13 '13 at 23:11
  • 1
    @Andreas Interesting. I guess you could modify the first line so that it just grabbed the first element of `...`, checked if that was a list, and assigned that to `dots`. – Ari B. Friedman Nov 14 '13 at 00:16
  • 1
    Very helpful, thank you. Notes for others: 1) this is on CRAN now, so you can `install.packages('taRifx.geo')`. 2) `a` and `b` must have the same names, otherwise the command fails. Running `names(b) <- names(a)` was all I needed to do, then `rbind` worked great (this may or may not be appropriate for your data, but was for me) – arvi1000 Feb 01 '16 at 22:07