4

I'm plotting a map here and can't fill it right, I hope you can help. So what do I do

Read the map and change projection:

map<-readOGR(dsn="e:\\r\\OSM_adm_reg\\adm4_region (1)", layer="adm4_region")
map=spTransform(map,CRS("+proj=latlong +ellips=GRS80"))

Read data about location of logistics warehouses throughout the country and match the world.cities database to plot them on the map:

    data(world.cities)
    russian.cities=world.cities[world.cities$country.etc=="Russia",]
    metro_outlets=read.csv(file="e:\\r\\retail\\metro.csv",head=TRUE,sep=";")
    match<-match(metro_outlets$CITY_ENGLISH, russian.cities$name)
    outlet_coordinates=russian.cities[match,]
    outlet_coordinates=cbind(outlet_coordinates, "metro_store_count"=metro_outlets$STORE_COUNT_METRO)
    outlet_coordinates=cbind(outlet_coordinates, "cch_depo"=metro_outlets$DEPO_FOOD)
    outlet_coordinates=cbind(outlet_coordinates, "NAME_LAT"=metro_outlets$STORE_OBLAST)
    outlet_coordinates=cbind(outlet_coordinates, "cch_franchise"=metro_outlets$Franchise)
outlet_coordinates=cbind(outlet_coordinates, "SL"=metro_outlets$SL)

Basically, I'm gonna need to fill regions based on the values of the SL column. Now, Im trying to get a list of regions that contains the WHs:

outlet_spatial=SpatialPoints(outlet_coordinates[,c(5,4)],proj4string=CRS(proj4string(map)))
index=over(outlet_spatial,map)
region_names=rownames(map@data[map@data$NAME_LAT %in% index$NAME_LAT,])
map_df<-fortify(map,map$NAME_LAT)
map_df$fill=ifelse(map_df$id %in% region_names,"true","false")

And plot the map:

ggplot()+
  geom_polygon(data=map_df,aes(x=long,y=lat,group=group,fill=fill),col="gray")+
  scale_fill_manual(values=c("gray","gray70"))+xlim(20, 180) + ylim(40,80)+
  geom_point(aes(x=long,y=lat), color="orange",alpha=I(8/10), data=outlet_coordinates)

This is the map I got!

Now, I don't know how to pass the values of the SL column to the map_df and fill regions accordingly. SL defined here: outlet_coordinates=cbind(outlet_coordinates,"SL"=metro_outlets$SL)

Stedy
  • 7,359
  • 14
  • 57
  • 77
user1851061
  • 43
  • 1
  • 4
  • I want to help, and I don't suppose it will even be too time consuming to do so, but that's a lot of code you have posted to extract the useful bits to help formulate you a solution. The number one thing that will help get you an answer: make it [REPRODUCIBLE](http://stackoverflow.com/q/5963269/1478381). Try pasting the structure of some of your data. Specifically `dput( head( metro_outlets ) )`. Also, in future, try to only post the most relevant bits of code. The shorter, the easier it is to follow. – Simon O'Hanlon Mar 09 '13 at 17:27
  • Thank you, Simon. Absolutely, I agree the code looks like mess - I have done it from many sources on the internet and have no deep knowledge of R, so it takes me an hour just to get a grip of what is going on:) About data: 1) metro outlets: http://picpaste.com/pics/osNJ1HfR.1362851019.jpg - click on the pic to open it full screen. 2) outlets_coordinates: http://picpaste.com/pics/4y8KelZB.1362851276.jpg – user1851061 Mar 09 '13 at 17:46
  • No problem. We all start off as learners. :-) The pics aren't so helpful. Try typing this command into R and pasting the output as an edit to your question, or as a comment if you must... `dput( head( metro_outlets ) )`. You are most of the way there. What you need to do is to transfer the SL attribute to the regions map. One question - are you sure that you only have one metro outlet in each map region and if not, do all the metro outlets in the same region have the same SL code? – Simon O'Hanlon Mar 09 '13 at 17:51
  • What happens in the script: I load locations from the cvs file, match it with the cities database and get its long&lat, now im able to put them on the map. Then I find regions that contain those cities using OVER function, and set a fill field to the regions. What I dont know is how do I pass the SL value from outlets_coordinates to the map and how to fill the map accordingly (low level - red, high level - green) – user1851061 Mar 09 '13 at 17:52
  • well, there more than 1 metro outlet per region, but SL is an average per region, so for every metro outlet in one region you have similar SL value – user1851061 Mar 09 '13 at 17:57
  • http://sharetext.org/RC41 - i can share text thru another site, though) – user1851061 Mar 09 '13 at 18:02
  • If this is a one-off you might want to consider QGIS or something similar. Beyond that, it looks as if you are passing a boolean in string format to the fill parameter of ggplot. Your fill value should be numeric, as the SL? – ako Mar 09 '13 at 18:11
  • @ako I'm passing the boolean just so it knows what regions should be filled. - Once I know how to pass SL numerical I'd set the fill option to SL. I feel like im surrounded by millions of tools I just dont know how to use them) – user1851061 Mar 09 '13 at 18:21
  • I see. For the record, an easy way of sharing data is to upload your .rda data object to a public folder on dropbox if you have an account there. Then you can load it from R with load(url('http://dl.dropbox.com/u/some_ID_NUMBER/mydata.rda')) – ako Mar 09 '13 at 18:50
  • Yes, see below (hopefully) – Simon O'Hanlon Mar 10 '13 at 04:12
  • @SimonO101 Hi, Simon, thank you for the code. Although it doesnt seem to work, it generates an error(below), it seems like you are passing CITY_ENGLISH but not STORE_OBLAST(which is similar to what is kept in the map). Link to the shape: http://gis-lab.info/data/yav/adm/stable/adm4_region.7z : Error: TopologyException: found non-noded intersection between LINESTRING (56.5052 75.0685, 56.6269 75.0832) and LINESTRING (56.6119 75.0821, 56.6045 75.0805) at 56.604450059013843 75.080522366110173. – user1851061 Mar 10 '13 at 13:59
  • I tried to pass it another way, istead of boolean I passed the value: map_df$SL=ifelse(map_df$id %in% region_names,index_and_data$SL,"false") but im not sure what values it passes – user1851061 Mar 10 '13 at 14:23
  • @SimonO101 This is what I get using qplot -> http://picpaste.com/pics/YPtXY5eK.1362926085.jpeg The values are wierd, they are all more than 1, but supposed to be between 0 and 1... – user1851061 Mar 10 '13 at 14:35
  • @SimonO101 Doest work for me, dunno why. The process is corrupted somewhere aroud df assignment. What have I done wrong? thats what I get, df is all na, as a result there is a map with no touch of colour to it and no SL values> [link]http://picpaste.com/pics/dl3nrlGT.1363028051.jpg – user1851061 Mar 11 '13 at 18:55
  • @SimonO101 even if it worked there is still something wrong about the map of yours. there are only six overlay, although if you look at my map there more overlays http://picpaste.com/Rplot-PdQyfUgR.jpeg – user1851061 Mar 11 '13 at 19:00
  • I get those warnings as well. It is because `SL` is being treated as a factor variable. But when I run it, it doesn't matter. I have updated the code and you can try again. In the meantime can you post the output from `summary(metro_outlets$SL)`. When you post code try to surround it with "`" characters so it looks like code when you post it. – Simon O'Hanlon Mar 11 '13 at 19:01
  • Obviously!! Remember the data you posted. `head( dput( metro_outlets ) )` - head() just gives me the first 6 rows, a sample of your data. That is why I only have 6. If you want me to run it on your full data set you need to share the output of `dput( metro_outlets )`. Can you please try running the code again. In the meantime I suggest you read up thoroughly on the functions I have used to understand what they do. – Simon O'Hanlon Mar 11 '13 at 19:02
  • I also had to make one more edit to the code I posted. I had `long` and `lat` the wrong way round in what I posted, but correct on my computer. Unless there is something very different about your full dataset it *has* to work! :-) – Simon O'Hanlon Mar 11 '13 at 19:14
  • @SimonO101 you a genuis, thank you for your time, mr.brain, very very well done. One last question - why SL are all above 1, but on the map SL range is from 1 to 30?. http://picpaste.com/Rplot01-VxO7i4UG.jpeg – user1851061 Mar 11 '13 at 19:27

1 Answers1

3

This was a little more involved than I thought, mainly because the TOPOLOGY errors you got are caused by poor geometry in your shapefile. Perhaps you could instead use good quality, publically available shapefiles. I suggest www.naturaleartchdata.com. I have included a link and downloaded that data in the script. You can copy and paste this. This should work:

require(ggplot2)
require(maptools)
require(sp)

# Your shape file has some topology errors meaning it can't be used by 'over()'
# Use open source shapefiles from www.naturalearth.com - v. good quality
oldwd <- getwd(); tmp <- tempdir(); setwd(tmp)
url <- "http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces_shp.zip"
dest <- paste(tmp,"\\tmp.zip",sep="")
download.file( url , dest )
unzip(dest)


#   Projection information for geographic data (but in decimal degrees)
projgeo <- CRS("+proj=latlong +datum=WGS84 +ellps=WGS84")


# Read in world shapefile
wld <- readShapePoly("ne_10m_admin_1_states_provinces_shp" , proj4string = projgeo)
# Subset to Russia
map <- wld[wld$admin == "Russia" , ]


# City and metro data
r.city <- world.cities[ world.cities$country.etc=="Russia" , ]
metro_outlets <- read.csv(file="e:\\r\\retail\\metro.csv",head=TRUE,sep=";")


# Assign metro outlets to city and make spatial point data frame
met <- subset( metro_outlets , select = c( "SL" , "CITY_ENGLISH" ) )
met$SL <- as.numeric( met$SL )
match <- match( met$CITY_ENGLISH , r.city$name )
coordinates( met ) <- r.city[ match , c( "long" , "lat" ) ]
proj4string( met ) <- proj4string( map )


# Assign metro outlet attribute "SL" to each region of map
# Find the average SL for each region using 'over()'
df <- over( map , met[ , "SL" ] , fn = mean , na.rm = TRUE )
map$SL <- as.numeric( df$SL )


# Convert th map into format for plotting with ggplot
map@data$id <- rownames( map@data )
map.df <- as.data.frame( map )
map.fort <- fortify( map , region = "id" )
map.gg <- join( map.fort , map.df , by = "id" )


# Plot  
p1 <-   ggplot( NULL ) + 
    geom_polygon( mapping = aes( long , lat , group = group , fill  = factor( SL ) ) , colour = "white" , data = map.gg , map = map.gg )+
    scale_x_continuous( limits = c( 20 , 180 ) )
print(p1)

enter image description here

Hope that helps!

Simon O'Hanlon
  • 58,647
  • 14
  • 142
  • 184