1

I'm trying to overlay geocoded citizen complaints onto a map of Chicago using ggplot. Due to overplotting, it seems like stat_density2d is a good visualization option.

I am able to generate the image, but for some reason the edge of the density plot gets clipped at the top. I wondered whether this was because data cuts off at the city limits, but this clipping only seems to happen at the northern edge.

Here's the image I have so far

I have tried to extend the map limits using coord_sf(ylim = c(41.5,42.1)), coord_sf(expansion = add(1)), and scale_y_continuous() variations, but they only alter the larger frame, not the density estimation itself. I've tried swapping out the drawing order, as well.

Below is the code that's generating the clipped map. Any ideas would be much appreciated!


Complaint data: Click 'export' at the top right hand corner of this page and choose 'filtered data'

Randomly selected reproducible sample of 30 requests:

structure(list(SR_NUMBER = c("SR22-00811165", "SR22-01164780", 
"SR22-01508217", "SR22-01755183", "SR22-01933678", "SR22-00247415", 
"SR22-00248335", "SR22-02151334", "SR22-01733700", "SR22-01975593", 
"SR22-01053765", "SR22-01944033", "SR22-00320606", "SR22-01980637", 
"SR22-01880622", "SR22-01384246", "SR22-01928371", "SR22-00194667", 
"SR22-00756205", "SR22-01739504", "SR22-01125040", "SR22-01909737", 
"SR22-01489811", "SR22-01446492", "SR22-01490346", "SR22-00831619", 
"SR22-01304939", "SR22-00529288", "SR22-01199406", "SR22-01646390"
), LATITUDE = c(41.963223205, 41.851106221, 41.855196317, 41.693009531, 
41.848866001, 41.777333685, 41.793388436, 41.934834001, 41.804204287, 
41.960376001, 41.720554931, 41.907222001, 41.775464943, 41.837652001, 
41.968710672, 41.83717756, 41.784786001, 41.938766477, 41.855461237, 
41.808925011, 41.928731684, 41.969616378, 41.831782167, 41.742884267, 
41.915204432, 41.948663163, 41.819584417, 41.932760733, 41.934298928, 
41.891709425), LONGITUDE = c(-87.660579619, -87.673965116, -87.709937466, 
-87.7097159, -87.7170735, -87.791212511, -87.668443344, -87.6591135, 
-87.668222591, -87.7592655, -87.535027818, -87.6647295, -87.764238868, 
-87.6308175, -87.651489314, -87.719711511, -87.6243105, -87.713442426, 
-87.659063397, -87.704466108, -87.793867661, -87.800404373, -87.652121052, 
-87.654461691, -87.697921786, -87.645079844, -87.601299951, -87.662822873, 
-87.70164214, -87.77186285), LOCATION = c("(41.963223204954666, -87.66057961936015)", 
"(41.85110622124882, -87.67396511646133)", "(41.85519631667924, -87.70993746578567)", 
"(41.69300953133947, -87.70971590015571)", "(41.84886600094033, -87.71707350000189)", 
"(41.777333685235796, -87.79121251067474)", "(41.79338843610715, -87.66844334382944)", 
"(41.93483400094065, -87.65911350000192)", "(41.804204287308636, -87.66822259059123)", 
"(41.960376000940705, -87.75926550000192)", "(41.720554931383866, -87.53502781763595)", 
"(41.90722200094053, -87.6647295000019)", "(41.775464942875715, -87.76423886843776)", 
"(41.837652000940295, -87.6308175000019)", "(41.968710672022326, -87.65148931416674)", 
"(41.83717755952037, -87.71971151062469)", "(41.784786000940116, -87.6243105000019)", 
"(41.938766477326084, -87.71344242628473)", "(41.85546123655601, -87.65906339687139)", 
"(41.808925011186346, -87.70446610808264)", "(41.928731683787674, -87.79386766069565)", 
"(41.96961637805988, -87.80040437311133)", "(41.83178216737375, -87.65212105205913)", 
"(41.74288426698524, -87.65446169132674)", "(41.915204432408586, -87.69792178619974)", 
"(41.94866316272216, -87.64507984403252)", "(41.81958441681623, -87.60129995144278)", 
"(41.93276073333015, -87.66282287256256)", "(41.9342989284628, -87.70164214047698)", 
"(41.891709424820085, -87.77186284991478)")), row.names = c(16443L, 
17295L, 13838L, 18333L, 4450L, 10010L, 31727L, 666L, 49325L, 
4936L, 29437L, 4034L, 32122L, 5043L, 2114L, 26466L, 3758L, 31484L, 
15699L, 49367L, 20462L, 3004L, 45771L, 45139L, 45780L, 16639L, 
43143L, 13131L, 25258L, 48176L), class = "data.frame")

City shapefile: https://data.cityofchicago.org/api/geospatial/cauq-8yn6?method=export&format=Shapefile

P.S. I wasn't sure how to make this most conveniently reproducible, so I put paths and file names in brackets.

library(ggplot2)
library(readxl)
library(MASS)
library(viridis)
library(sf)
library(RColorBrewer)
library(tidyr)
library(sf)
library(ggthemes)

requests <- read.csv([requests file path])
requests <- requests %>% drop_na(LONGITUDE) %>% drop_na(LATITUDE)

chicago <- st_read([City shapefile path])

ggplot() +
  geom_sf(data = chicago) + 
  stat_density2d(data = requests, mapping = aes(fill=after_stat(level), x = LONGITUDE, y = LATITUDE), h = 0.03, alpha = 0.5, geom = 'polygon', show.legend = FALSE) +
  scale_fill_gradientn(colors = magma(n = 200, alpha = 0.5)) + 
  geom_point(data = requests, size = 0.001, alpha = 0.03, color = 'red', mapping = aes(x = LONGITUDE, y = LATITUDE)) + 
  theme_map()

Alonninos
  • 11
  • 3
  • 2
    Greetings! Usually it is helpful to provide a minimally reproducible dataset for questions here so people can troubleshoot your problems (rather than a table or screenshot for example). One way of doing is by using the `dput` function on the data or a subset of the data you are using, then pasting the output into your question. You can find out how to use it here: https://youtu.be/3EID3P1oisg – Shawn Hemelstrand Jan 21 '23 at 02:34
  • Hi @Shawn, thank you for the kind pointer! I added in a randomly selected sample of 30 requests using dput, as you suggested. Should I have assigned the dput info directly to the requests variable in the code chunk to make it a bit easier? Appreciate your help! – Alonninos Jan 21 '23 at 15:07
  • Looks good! Not sure I can answer your question but this should help others do so if theyre able – Shawn Hemelstrand Jan 21 '23 at 15:32

0 Answers0