9

I plan to do a survey in Switzerland. NPA will be asked.

NPA (postal codes) contains 4 number.

  • For instance 1227 is the NPA of Carouge (part of canton Geneva - Switzerland).
  • For instance 1784 is the NPA of Courtepin (part of canton Fribourg -Switzerland).
  • etc.

I would like to know how to represent all observation (about 1500) on a map. I was thinking using ggplot as I use it for other graphs (I think ggplot is "beautiful"). However, I'm open to any other suggestion.

Here are some fake data: http://pastebin.com/HsuQnLP3

The output for the swiss map should be a bit like that USA map (credit:http://www.openintro.org)

enter image description here

Update:

I've tried to create some code :

library(sp)
test <-  url("https://dl.dropboxusercontent.com/u/6421260/CHE_adm3.RData")
print(load(test))
close(test)

gadm$NAME_3
gadm$TYPE_3

But it seems http://gadm.org/ doesn't provide the NPA of the communes...

New update:

I've find (thanks @yrochat) a shapefile with NPA: http://www.cadastre.ch/internet/cadastre/fr/home/products/plz/data.html

I'ts the ZIP file called : Shape LV03

Then I've tried

library("maptools")
swissmap <- readShapeLines("C:/Users/yourName/YourPath/PLZO_SHP_LV03/PLZO_PLZ.shp")
plot(swissmap)
data <- data.frame(swissmap)
data$PLZ #the row who gives the NPA

As I have the PLZ on a shapefile, how do i color my observation on the map? I provided some fake data on data http://pastebin.com/HsuQnLP3

enter image description here

Thanks

S12000
  • 3,345
  • 12
  • 35
  • 51
  • Do you already have the means to match your survey responses to spatial data (ie. geocode)? If so, you might consider aggregating the responses to a larger spatial scale, similar to what you would see represented by chloropleths: http://blog.revolutionanalytics.com/2009/11/choropleth-challenge-result.html –  Jul 20 '13 at 01:12
  • My advice: add to your question a subset of the real data or some fake data and we will be able to help. Yes, it's probably possible to use `ggplot` for this kind of thing; I have done so many times. Read [this post](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) if you have not already. [This answer](http://stackoverflow.com/questions/13655230/how-do-you-combine-a-map-with-complex-display-of-points-in-ggplot2) may also be relevant. – SlowLearner Jul 20 '13 at 09:27
  • @SlowLearner Hello, I've just added fake data. – S12000 Jul 22 '13 at 18:20
  • @JimM. Sorry, can you explain me more about how to acheive this according to the fake data I provied. I can't understand yet... – S12000 Jul 22 '13 at 18:21
  • 1
    I think we need some way of linking NPAs to cantons, some kind of lookup table that allows us to establish that NPA 1227 is part of the canton of Geneva. Then it should be possible aggregate results for the NPAs to canton level and make a choropleth of that. Currently we have no way to assign NPAs to the map. If you had latitude and longitude for the centre of each NPA, that might work I guess. – SlowLearner Jul 22 '13 at 23:32
  • @SlowLearner Hello, I may get latitude and longitude. However, I don't want to assign all part such as 1227 to a canton. I wish to have a map in which all part (square liked shape in the map) is a part of a canton. A part of a canton is usually called a "commune" in french. – S12000 Jul 23 '13 at 00:26
  • So you want to show the NPA areas on the map? I assume that to do that you will have to find a shapefile that contains the outlines of the Swiss NPA areas in each commune. Do you have access to such data? – SlowLearner Jul 23 '13 at 07:31
  • @SlowLearner "So you want to show the NPA areas on the map? " -> Yes exactly, is what I want to do. I don't have access to such data (shapefile). – S12000 Jul 23 '13 at 13:56
  • @Swiss12000: There should be some shapefiles or spatialPolygonDataframe data at www.gadm.org that should have the spatial level that you require (Administrative Level 3?) –  Jul 23 '13 at 17:44
  • The Swiss data at gadm.org seems to have communes, not sure about postal code areas? Happy to be proven wrong though. – SlowLearner Jul 23 '13 at 22:36
  • @JimM. I've created some code with gadm.org with administrative level 3 ... – S12000 Jul 24 '13 at 12:57
  • @SlowLearner Unfortunately I think you right (cf. updated post) – S12000 Jul 24 '13 at 12:57
  • @all up... can someone help me I'm still stuggling finding a solution to my issue... Thanks – S12000 Jul 31 '13 at 13:23
  • If you don't have a shapefile showing the NPA areas, I don't think help is possible. That's not data we can just magically create. Commune-level data we can get but not NPAs as far as I can see. You do want the NPA areas, right? – SlowLearner Aug 01 '13 at 22:38
  • @SlowLearner I find the shapefile with NPA area (cf. my updated post) – S12000 Aug 06 '13 at 15:27
  • In case anyone is using this post for a similar task, the new URL of the NPA is http://www.cadastre.ch/internet/kataster/de/home/services/service/plz.html – greyBag Jun 06 '16 at 13:01

1 Answers1

9

OK, with the shapefile, we can plot things easily enough.

work.dir <- "directory_name_no_trailing slash"

# open the shapefile
require(rgdal)
require(rgeos)
require(ggplot2)
ch <- readOGR(work.dir, layer = "PLZO_PLZ")

# convert to data frame for plotting with ggplot - takes a while
ch.df <- fortify(ch)

# generate fake data and add to data frame
ch.df$count <- round(runif(nrow(ch.df), 0, 100), 0)

# plot with ggplot
ggplot(ch.df, aes(x = long, y = lat, group = group, fill = count)) +
    geom_polygon(colour = "black", size = 0.3, aes(group = group)) +
    theme()

# or you could use base R plot
ch@data$count <- round(runif(nrow(ch@data), 0, 100), 0)
plot(ch, col = ch@data$count)

Personally I find ggplot a lot easier to work with than plot and the default output is much better looking.

screenshot

And ggplot uses a straightforward data frame, which makes it easy to subset.

# plot just a subset of NPAs using ggplot
my.sub <- ch.df[ch.df$id %in% c(4,6), ]
ggplot(my.sub, aes(x = long, y = lat, group = group, fill = count)) +
    geom_polygon(colour = "black", size = 0.3, aes(group = group)) +
    theme()

Result:

screenshot2

SlowLearner
  • 7,907
  • 11
  • 49
  • 80
  • thank you so much is perfectely working. Is really what I wanted. You made my day. :) Some Swiss people may however notice the absence of some lake such as the Zurich's one. I will write to the website who provide the shape file in order to discover why... – S12000 Aug 07 '13 at 01:47
  • Well, I would think that people do not live in the lake, so they will not get letters, so there is no postal code? Remember this is a map of postal areas only, not physical boundaries. – SlowLearner Aug 07 '13 at 07:13
  • yeah the office told me that they will imrove this issue ASAP. However, I'm affraid but I tried to use a plot according to my fake data `raw <- read.csv("http://pastebin.com/raw.php?i=HsuQnLP3", sep="\t")`. But I cant figure out how to match raw$NPA to the map. I understand it by generating random variables but not with real data... – S12000 Aug 10 '13 at 02:17
  • Well, first of all there's something wrong with the sample data - I get a `Duplicate entry '2147483647' for key 'PRIMARY` error. More to the point, the point of a choropleth is that it colours sections on a map (in this case those are NPAs) according to some underlying data. If your data is arranged `Number,Gender,NPA` then what do you want the colour of each NPA to represent? If the gender, that's only two values. Do you want it to show the survey number? – SlowLearner Aug 10 '13 at 06:20
  • Hello, actually I'm interested to get the NPA by frequency. For instance, in my fake data the NPA 1197 has a frequency of 2 and 1223 only 1. So the shape of 1197 will be darker than the shape of 1223. – S12000 Aug 10 '13 at 13:12
  • I don't see a `frequency` column in the data. How is this column derived? If the frequency column is already there, just replace `ggplot(my.sub, aes(x = long, y = lat, group = group, fill = count))` with `ggplot(my.sub, aes(x = long, y = lat, group = group, fill = frequency))` in the code above. If you don't know how to format your data, ask another, separate question after first searching through Stack Overflow carefully. And please in future include fake data that actually looks like your real data. – SlowLearner Aug 10 '13 at 13:33
  • Ok thanks I will try to work on it. And If I cannot find a solution I create a new post – S12000 Aug 13 '13 at 03:14