I've been attempting to create a TopoJson file with consolidated layer data containing, among other layers, U.S. States, Counties, and Congressional Districts.

Original .shp shapefiles come from the Census Bureau's Cartographic Boundary Files.

These were converted to GeoJson via ogr2ogr.

I then combined them into TopoJSON using the topojson Node.js (server-side) library, with a quantization of 1e7 and a retain-proportion of 0.15. Up to this point there is no indication of any problem.

I view the final TopoJSON file using mapshaper and things seem to look OK:

[image: rendered via mapshaper]

But when attempting to render with the topojson client library and d3.geo.path(), I encounter some strange paths in the congressionalDist layer (notice the large rectangular paths around the continental US, AK, and HI):

[image: square paths]

A working version of the page can be found here: http://jsl6906.net/D3/topojson_problem/map/

A couple of relevant notes:

  • If I change my topojson generation script to remove simplification, the paths then seem to show correctly via the same d3.js page
  • If I only keep the congressionalDist layer when creating the topojson, the paths then seem to show correctly via the same d3.js page:

[image: good]

After as much troubleshooting as I could manage, I figured I would ask here in case someone has experienced a similar issue. Thanks for any help.

Josh
  • This might be related to the problems mentioned in http://stackoverflow.com/questions/23953366/d3-large-geojson-file-does-not-show-draw-map-properly-using-projections/24055015#24055015. There, the calculation of the bounds went wrong for some of the regions, which also resulted in large rectangles. In your example, `d3.geo.bounds(cds[84])` results in `[[-180, -90], [180, 90]]`, which seems to be incorrect. I do not know why this happens, though. – Jan van der Laan Jul 02 '14 at 08:37
  • Still not sure what's causing this, but one interesting thing I've noticed is that the `id` property of the data bound to the offending rectangles ends in `ZZ`, whereas all other objects have an id ending with two digits. The ids responsible are 09ZZ, 17ZZ, and 26ZZ. For example, try the following: `d3.selectAll(d3.selectAll('.cd')[0].filter(function(d) { return d3.select(d).attr('id').slice(-2) === 'ZZ' })).style('stroke', 'red')` and you will notice that only those rectangles are colored red. – jshanley Aug 02 '14 at 03:04
  • It seems `ZZ` is the code given to "undefined" congressional districts. I'm not exactly sure what this means, but you can see it occurring in [this dataset](http://www.census.gov/geo/reference/codes/files/national_cd113.txt) under the column CD113FP, wherever the NAMELSAD column contains "Congressional Districts not defined". Also there is a reference to removing such districts when running ogr2ogr in [**this file**](https://github.com/mbostock/us-atlas/blob/bf502099b48e54116c88f277e6d800836ecbc210/Makefile#L276-L279) which is part of [`us-atlas`](https://github.com/mbostock/us-atlas) – jshanley Aug 02 '14 at 03:17
  • In case this might be useful - here is my complete workflow: 1. Download shapefiles (https://www.census.gov/geo/maps-data/data/tiger-cart-boundary) 2. Convert shapefiles to geoJson (http://jsl6906.net/D3/topojson_problem/3create_geo_jsons.bat.txt) 3. Combine geoJson files into topoJson (http://jsl6906.net/D3/topojson_problem/3create_topo_json.js) – Josh Aug 04 '14 at 18:57
  • Is it possible your paths are "inside out" (counter-clockwise versus clockwise)? What happens if you assign a fill color to Alaska or Hawaii -- does it fill everything in the rectangle *except* the islands/state? See, e.g. http://stackoverflow.com/q/21786168/3128209 – AmeliaBR Aug 07 '14 at 14:58
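
One way to test the "inside out" idea from the last comment is to check the winding order of each ring directly. The following is only a minimal sketch: `ringSignedArea` and `ringWinding` are hypothetical helper names, and the check uses the planar shoelace formula on raw longitude/latitude pairs, so it is a heuristic rather than a true spherical test. Which orientation a renderer treats as the "inside" depends on its own convention.

```javascript
// Signed planar area of a closed ring given as [[x, y], ...] (shoelace formula).
// Positive means counterclockwise, negative means clockwise, with x pointing
// right and y pointing up (i.e. longitude as x, latitude as y).
function ringSignedArea(ring) {
  var sum = 0;
  for (var i = 0; i < ring.length - 1; i++) {
    sum += ring[i][0] * ring[i + 1][1] - ring[i + 1][0] * ring[i][1];
  }
  return sum / 2;
}

// Hypothetical helper: classify a ring's planar winding order.
function ringWinding(ring) {
  var area = ringSignedArea(ring);
  if (area > 0) return 'counterclockwise';
  if (area < 0) return 'clockwise';
  return 'degenerate';
}

// Made-up unit square, for illustration only.
var square = [[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]];
console.log(ringWinding(square));
console.log(ringWinding(square.slice().reverse()));
```

Running this over the exterior ring of a rectangle-producing feature and over a feature that renders correctly would show whether the two differ in orientation.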

1 Answer


As I mentioned in the comments, I had noticed that the three offending rectangles all were bound to data with an id property ending in ZZ, while all other paths had IDs ending with numbers.

After some Google searching, I came up with what I think is the answer.

According to this document on the census.gov website,

In Connecticut, Illinois, and Michigan the state participant did not assign the current (113th) congressional districts to cover all of the state or equivalent area. The code “ZZ” has been assigned to areas with no congressional district defined (usually large water bodies). These unassigned areas are treated within state as a single congressional district for purposes of data presentation.

It seems that these three undefined districts would account for the three rectangles. It is unclear at what point in the process they cause the issue, but I believe there is a simple solution to your immediate problem. While searching for information about the ZZ code, I stumbled across this makefile in a project by mbostock called us-atlas.

It seems he had encountered a similar issue and had managed to filter out the undefined congressional districts when running ogr2ogr. Here is the relevant code from that file:

# remove undefined congressional districts
shp/us/congress-ungrouped.shp: shp/us/congress-unfiltered.shp
    rm -f $@
    ogr2ogr -f 'ESRI Shapefile' -where "GEOID NOT LIKE '%ZZ'" $@ $<

I'm betting that if you run your ogr2ogr on your shapefile using the flags shown here it will solve the problem.
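
If re-running ogr2ogr is inconvenient, the same filter could probably be applied to the GeoJSON before it is handed to topojson. This is a minimal sketch, not part of us-atlas: `dropUndefinedDistricts` is a hypothetical helper, and it assumes each feature keeps the Census `GEOID` in `feature.properties.GEOID`, as the `-where` clause above implies.

```javascript
// Drop features whose GEOID ends in "ZZ" (undefined congressional districts).
// Assumption: the Census GEOID is stored in feature.properties.GEOID.
function dropUndefinedDistricts(collection) {
  return {
    type: 'FeatureCollection',
    features: collection.features.filter(function (feature) {
      var geoid = String((feature.properties || {}).GEOID || '');
      return geoid.slice(-2) !== 'ZZ';
    })
  };
}

// Tiny made-up sample, for illustration only (geometries omitted).
var sample = {
  type: 'FeatureCollection',
  features: [
    { type: 'Feature', properties: { GEOID: '0901' }, geometry: null },
    { type: 'Feature', properties: { GEOID: '09ZZ' }, geometry: null }
  ]
};

var filtered = dropUndefinedDistricts(sample);
console.log(filtered.features.map(function (f) { return f.properties.GEOID; }));
```

The function returns a new collection and leaves the input untouched, so the filtered and unfiltered versions can be compared side by side.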

jshanley
  • Interesting, thank you. I will look into this further in the next couple of days. While it does seem that this may be the root of the problem, it does not seem to explain why the paths render fine if I do not combine them with state/county shapefiles, or why if I do not simplify the shapes using topojson, the problem does not exist. Any quick reaction to this? – Josh Aug 02 '14 at 17:45
  • Not at the moment. If you're continuing to investigate the details of it I would suggest you examine what your dataset looks like at each step in the conversion process, paying particular attention to any significant differences in the *kind* of data that represents the undefined districts when compared to the data for other districts. My guess is that after one of the steps, this data will be in a format that d3 cannot render correctly. – jshanley Aug 02 '14 at 19:38
  • Another possibility that occurs to me... When you say that you combine this data with state boundaries etc. is there a step in that process where the shapes or paths themselves are somehow merged into a single shape or path? If so, it's possible that the parts of these undefined districts that are over bodies of water might not be able to be merged with the state or county boundaries if those same bodies of water are used *as* the boundaries in the state/county dataset. – jshanley Aug 02 '14 at 19:48
  • Unfortunately, upon looking into this more closely, it does not seem that the 'ZZ' paths are what is causing the problem. Thanks for the lead, though. – Josh Aug 04 '14 at 18:56
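
For the step-by-step inspection suggested in the comments above, a small summary of each intermediate GeoJSON file may help spot where the data changes in kind. This is only a sketch under assumptions: `summarizeGeometries` is a hypothetical helper, and it expects a parsed GeoJSON FeatureCollection as input.

```javascript
// Count features by geometry type so that outputs from different conversion
// steps can be compared. Features without a geometry are counted as 'null'.
function summarizeGeometries(collection) {
  var counts = {};
  collection.features.forEach(function (feature) {
    var type = feature.geometry ? feature.geometry.type : 'null';
    counts[type] = (counts[type] || 0) + 1;
  });
  return counts;
}

// Made-up sample, for illustration only.
var step = {
  type: 'FeatureCollection',
  features: [
    { type: 'Feature', properties: {}, geometry: { type: 'Polygon', coordinates: [] } },
    { type: 'Feature', properties: {}, geometry: { type: 'MultiPolygon', coordinates: [] } },
    { type: 'Feature', properties: {}, geometry: { type: 'Polygon', coordinates: [] } },
    { type: 'Feature', properties: {}, geometry: null }
  ]
};

var summary = summarizeGeometries(step);
console.log(summary);
```

Comparing the summaries before and after simplification, and with and without the state/county layers, would show whether any district changes type or loses its geometry along the way.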