
I want to merge a SpatialPolygonsDataFrame:

# From https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html
library(rgdal)  # provides readOGR()
states <- readOGR(dsn = "./cb_2014_us_state_20m.shp",
                  layer = "cb_2014_us_state_20m", verbose = FALSE)

with a normal data frame:

my_counts <- data.frame(
  State = c(
    "CA", "TX", "IL", "FL", "NY", "OH",
    "NJ", "GA", "MI", "PA", "MA", "CO", "AZ", "NC", "VA", "WA", "IN",
    "MD", "MN", "WI", "MO", "TN", "IA", "KY", "LA", "SC", "CT", "AL",
    "KS", "OR", "OK", "AR", "NV", "UT", "NE", "ID", "MS", "DC", "NM",
    "NH", "ME", "AK", "RI", "MT", "HI", "WV", "SD", "ND", "DE", "VT",
    "WY", "PR", "GU", "VI", "MP", "AS", "na", "MH", "FM", "PW"
  ),
  count = c(
    1590533L, 1016328L, 754535L, 742603L, 714205L,
    538719L, 477278L, 452064L, 437162L, 428616L, 420332L, 391084L,
    380853L, 354601L, 342533L, 335505L, 294670L, 286026L, 273427L,
    246172L, 238968L, 236037L, 235030L, 209514L, 199013L, 191707L,
    185521L, 179931L, 163477L, 159862L, 142610L, 136006L, 120111L,
    117338L, 112671L, 106176L, 102564L, 100168L, 97496L, 69881L,
    69508L, 68684L, 65631L, 62109L, 61123L, 57300L, 57254L, 56091L,
    51696L, 33944L, 32136L, 4822L, 598L, 468L, 49L, 19L, 17L,
    11L, 2L, 1L
  )
)

The goal is to use the result to make a map with leaflet.

I tried sp::merge:

 df1 <- sp::merge(x = states, y = my_counts)

but I get an error:

Error in table(y[, by.y]) : attempt to set an attribute on NULL
Ignacio
  • One more tip (since @bondeddust nailed the answer) is to use `stringsAsFactors=FALSE` in the `readOGR` call _and_ in the `data.frame` creation to avoid potential factor/character issues as you manipulate the data. – hrbrmstr Aug 26 '15 at 04:36

2 Answers


Caveat: I've never done this before, so I'm "feeling my way around". First, look at the structure of the `states` object.

Note: this was with rgdal_0.9-3 and sp_1.1-1 loaded under R 3.2.1 (and with GDAL installed on my OS X system, from kyngchaos, IIRC):

> str(states)
Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
  ..@ data       :'data.frame': 52 obs. of  9 variables:
  .. ..$ STATEFP : Factor w/ 52 levels "01","02","04",..: 5 9 10 11 13 14 16 18 19 21 ...
  .. ..$ STATENS : Factor w/ 52 levels "00068085","00294478",..: 22 17 2 18 27 28 29 30 16 19 ...
  .. ..$ AFFGEOID: Factor w/ 52 levels "0400000US01",..: 5 9 10 11 13 14 16 18 19 21 ...
  .. ..$ GEOID   : Factor w/ 52 levels "01","02","04",..: 5 9 10 11 13 14 16 18 19 21 ...
  .. ..$ STUSPS  : Factor w/ 52 levels "AK","AL","AR",..: 5 8 10 11 14 15 13 18 19 21 ...
  .. ..$ NAME    : Factor w/ 52 levels "Alabama","Alaska",..: 5 9 10 11 13 14 16 18 19 21 ...
  .. ..$ LSAD    : Factor w/ 1 level "00": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ ALAND   : num [1:52] 4.03e+11 1.58e+08 1.39e+11 1.49e+11 2.14e+11 ...
  .. ..$ AWATER  : num [1:52] 2.05e+10 1.86e+07 3.14e+10 4.95e+09 2.40e+09 ...
  ..@ polygons   :List of 52
  .. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
  .. .. .. ..@ Polygons :List of 6
  .. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
  .. .. .. .. .. .. ..@ labpt  : num [1:2] -118.4 33.4
  .. .. .. .. .. .. ..@ area   : num 0.0259
  .. .. .. .. .. .. ..@ hole   : logi FALSE
#####   Snipped rest of output ............................

So after looking for help on merge and reading:

 ?merge   # and choosing the option for:

Merge a Spatial* object having attributes with a data.frame
(in package sp in library /Library/Frameworks/R.framework/Versions/3.2/Resources/library)

I decided to try the following (and appear to have succeeded):

> newobj <- merge(states, my_counts, by.x="STUSPS", by.y="State")
Warning message:
In .local(x, y, ...) : 8 records in y cannot be matched to x

> names(newobj@data)
 [1] "STUSPS"   "STATEFP"  "STATENS"  "AFFGEOID" "GEOID"    "NAME"    
 [7] "LSAD"     "ALAND"    "AWATER"   "count"   
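The same call can be tried without the shapefile. The following is only a minimal sketch, assuming the sp package is installed: `toy_states`, `toy_counts` and the helper `sq` are hypothetical names, and the two squares are made-up geometry standing in for state polygons.

```r
library(sp)

# Hypothetical helper: a unit square starting at x0, wrapped as a Polygons object.
# Coordinates run clockwise, which sp treats as an outer ring rather than a hole.
sq <- function(x0, id) {
  coords <- cbind(c(x0, x0, x0 + 1, x0 + 1, x0), c(0, 1, 1, 0, 0))
  Polygons(list(Polygon(coords)), ID = id)
}

# Toy stand-in for `states`: two polygons with a STUSPS attribute
polys <- SpatialPolygons(list(sq(0, "a"), sq(2, "b")))
attrs <- data.frame(STUSPS = c("CA", "TX"), row.names = c("a", "b"),
                    stringsAsFactors = FALSE)
toy_states <- SpatialPolygonsDataFrame(polys, attrs)

# Toy stand-in for `my_counts`, including one code ("GU") with no polygon
toy_counts <- data.frame(State = c("TX", "CA", "GU"), count = c(2L, 1L, 3L),
                         stringsAsFactors = FALSE)

# Name the key column on each side, as in the call above
merged <- merge(toy_states, toy_counts, by.x = "STUSPS", by.y = "State")
print(merged@data)
```

The result is still a SpatialPolygonsDataFrame with one row per polygon; the unmatched "GU" row from the counts table is simply not carried over.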

The warning makes sense. You seem to have some extra "States" that are not present in that "states" shapefile:

> length( table(my_counts$State))
[1] 60
> length( unique(states@data$STUSPS) )
[1] 52
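To see which codes are the unmatched ones, `setdiff` can be used. This is a rough sketch rather than output from the real objects: `count_codes` is a hypothetical handful of codes taken from `my_counts$State`, and `shape_codes` assumes the 52 values of `states$STUSPS` are the 50 states plus DC and PR.

```r
# Hypothetical stand-ins for my_counts$State and states$STUSPS
count_codes <- c("CA", "TX", "DC", "PR", "GU", "na", "PW")
shape_codes <- c(state.abb, "DC", "PR")  # assumption: 50 states + DC + PR

# Codes present in the counts table but absent from the shapefile
unmatched <- setdiff(count_codes, shape_codes)
print(unmatched)
```

With the real objects, the equivalent call would be `setdiff(my_counts$State, states$STUSPS)`.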

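The moral

Look at the column names in the two objects before merging. Here they share no name, so `merge` has no common column to use as a default key, and `by.x` and `by.y` must be given explicitly: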

> names(states)
[1] "STATEFP"  "STATENS"  "AFFGEOID" "GEOID"    "STUSPS"   "NAME"     "LSAD"    
[8] "ALAND"    "AWATER"  

> names(my_counts)
[1] "State" "count"
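That check can be done in one line with `intersect`. A small sketch using the two sets of names printed above, copied into plain vectors (`x_names` and `y_names` are hypothetical variable names) so it runs without the shapefile:

```r
# Column names as printed by names(states) and names(my_counts)
x_names <- c("STATEFP", "STATENS", "AFFGEOID", "GEOID", "STUSPS",
             "NAME", "LSAD", "ALAND", "AWATER")
y_names <- c("State", "count")

# Columns that could serve as a default merge key; empty here
shared <- intersect(x_names, y_names)
print(shared)
```

An empty result means there is no shared column name, which is the case where the key columns have to be named explicitly.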
IRTFM
  • You can also work with the `@data` slot directly (not recommended unless one knows what they're doing) and the real key for this procedure is to also not mess with the order of the rows OR the rownames. – hrbrmstr Aug 26 '15 at 04:35
  • Thanks for this answer! I was expecting this to be much more complicated. – user2813606 Oct 28 '20 at 01:25

Maybe you should add the argument `incomparables`, as in the usage example:

merge(x, y, by = intersect(names(x), names(y)),
      by.x = by, by.y = by, all.x = TRUE, suffixes = c(".x", ".y"),
      incomparables = NULL, ...)