0

I have data on transplant cases for entire united states with about 208 zip codes which are for transplant centers. I also have their latitude and longitude information. I am trying to perform spatial survival analysis to see spatial effect of the transplant centers. I have this code and I don't know how to create adj.mtx for proximity.

survregbayes(formula=Surv(time,cens)~age+sex + +frailtyprior("car",transplantcenter),data=d,survmodel="PO", + dist="loglogistic",mcmc=mcmc,prior=prior,Proximity=adj.mtx)

Here is what my data look like:

Ryan Warnick
  • 1,079
  • 2
  • 10
  • 15
todlerR
  • 11
  • 1
  • 1
    I tried to answer the question, but the question itself needs some editing in it's formatting and grammar so that what you are asking is more clear. – Ryan Warnick Dec 03 '17 at 08:33

2 Answers2

3

I don't think the actual zip codes are useful as numeric adjacency does not necessarily correspond to physical adjacency. I think what you're asking is how to get a neighborhood matrix for zip codes, but to do that I believe you'd need more information. For example, you could calculate the distances from the centers of each zip code (the given latitudes and longitudes) and threshold them, but some higher density areas will have smaller distances due to the relative size of the zip codes.

Assuming you wanted to do that you can use the geosphere package. This package has a function call distHaversine which takes as arguments two points, and an optional spherical radius argument, and gives you the distance between the points taking into account the curvature of the earth and everything.

All of that said, a better approach is probably to download data on the boundaries of zip codes and look at where boundaries intersect. The most up to date data I could find (for free) is here; which is pretty old. To access it click the drop down menu and select Zip Code Tabulation Areas. The file is about 500 mb.

The unzipped data has a .shp file, and this post has a good tutorial for working with shape data. Blatantly taking some stuff from that guy's post, the following code gets the shape files into memory:

# Install dependencies
install.packages("rgeos")
install.packages("maptools")
library(rgeos)
library(maptools)
# Define the projection to be used
crswgs84=CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
# Load the shapes
postal.codes=readShapePoly("tl_2013_us_zcta510.shp",proj4string=crswgs84,verbose=TRUE)

Following along with this answer, in the rgeos package there is a function called gTouches which constructs an adjacency matrix from polygon shape boundaries.

# Get adjacency matrix, returnDense=FALSE is to get a sparse matrix for memory purposes
adj.mat <- gTouches(postal.codes, byid=TRUE, returnDense=FALSE)

Running that code for all the zip codes can be very time consuming (and throws some errors possibly due to extreme differences in location), but to access only the ones you are interested in you can do row indexing on the shape object. For example, getting the adjacency for the first 200 zip codes is done by doing:

adj.mat.sub <- gTouches(postal.codes[1:200,], byid=TRUE, returnDense=TRUE)

The actual zip codes themselves are in a data frame within the shape object. You can access them by doing:

postal.codes@data$ZCTA5CE10

You can use these to obtain the indices for zip codes present in your own data, and more quickly compute the adjacency matrix on that smaller, relevant, subset. All of this will require some bookkeeping with the census data and your own data frame.

There are probably more current zip code geometries available for purchase (or maybe free) out there, but the census data is a good combination of available and easy to work with.

Ryan Warnick
  • 1,079
  • 2
  • 10
  • 15
2

I agree wiht Ryan Warnick that you probably need polygons. However, you should not use maptools::readShapePoly. That is an outdated and incomplete function.

Use either functions from the sf package (sf_read) or from rgdal (in this case made easier via the raster package)

library(raster)
library(rgdal)

postal.codes <- shapefile("tl_2013_us_zcta510.shp")

You can try to roll your own adjacency matrix, as suggested by Ryan Warnick, but there are well established functions for that. Notably in the spdep package.

library(spdep)
nb <- poly2nb(postal.codes)

You can transform nb with functions such as spdep::nb2mat and spdep::sp2listw

P.S. there are several distance functions in geosphere, but distGeo is the most accurate.

Robert Hijmans
  • 40,301
  • 4
  • 55
  • 63
  • Upvoted. I kind of figured there were better ways to do it. I just wanted to get across that you would probably need to access some external data set. – Ryan Warnick Dec 03 '17 at 22:22