
How can I create a lattice chart with spplot() when the data are in long format, i.e. there are several rows for every region? I have the unemployment rate (unemp) for many years (year) for each region (CSO_NAME).

This is my code to load the map and merge the data:

library(rgdal)
library(sqldf)

# Import map and assign data.shape@data to spdata
data.shape<-readOGR(dsn="folder",layer="mylayer")    
spdata <- data.shape@data

# Load statistics data
unemp <- read.csv("cso_unemployment_rwise.csv")

# Merge data with spdata
spdata <- sqldf("select sp.*, cu.year, cu.unemp from spdata sp join unemp cu on (sp.nazok_a = cu.CSO_NAME) ")

# Build new spdata
spdata_merged <- SpatialPolygonsDataFrame(data.shape, spdata) 
# This fails: length(Sr@polygons) == nrow(data) is not TRUE
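
The join above returns one row per region and year, so nrow(spdata) is larger than the number of polygons, which is what the failed check reports. A minimal sketch of one possible workaround, assuming base R's reshape() is acceptable: widen the statistics to one row per region first, then align them with the polygon attribute table by name. The toy data frames and the names long, wide, poly_attr and merged are hypothetical stand-ins for unemp and data.shape@data.

```r
# Toy long-format statistics: 3 regions x 2 years (made-up values)
long <- data.frame(
  CSO_NAME = rep(c("A", "B", "C"), times = 2),
  year     = rep(c(2009, 2010), each = 3),
  unemp    = c(5.1, 7.3, 6.0, 5.8, 8.1, 6.4)
)

# One row per region, one column per year: unemp.2009, unemp.2010
wide <- reshape(long, idvar = "CSO_NAME", timevar = "year",
                direction = "wide")
value_cols <- setdiff(names(wide), "CSO_NAME")

# Stand-in for data.shape@data; the row order is the polygon order
poly_attr <- data.frame(nazok_a = c("C", "A", "B"))

# Look up each polygon's row in the wide table; polygon order is preserved
idx <- match(poly_attr$nazok_a, wide$CSO_NAME)
merged <- cbind(poly_attr, wide[idx, value_cols, drop = FALSE])
rownames(merged) <- rownames(poly_attr)  # keep the original row names (polygon IDs)

print(merged)
```

With the real objects, merged has exactly one row per polygon, so it could be assigned back with data.shape@data <- merged and plotted with spplot(data.shape, value_cols). Regions without a match in the statistics would get NA rather than being dropped.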

I thought I could use something like a formula, as in this barchart example:

barchart(year ~ unemp | nazok_a, data = spdata)  # after the join, the region name is in nazok_a

But because I can't merge the data with the polygons, I don't know what the next step should be. In this simple case I can easily transpose the data and then use something like:

spplot(spdata,c("y2009","y2010","y2011","y2012",...))

Reproducible Example

Here are sample data: stats_data has only one grouping variable (year), and stats_data2 has two grouping variables (year and sex).

# Get map
con <- url("http://gadm.org/data/rda/CZE_adm2.RData")
print(load(con))
close(con)
gadm_data <- gadm@data

# Create sample Data
stats_data <-
  data.frame(
      as.character(rep(gadm_data$NAME_2,3)),
      as.numeric(round(runif(3*length(gadm_data$NAME_2), 0, 1),digits=3)*100),
      # each= guarantees that every region gets every year exactly once
      as.factor(rep(c(2010,2011,2012),each=length(gadm_data$NAME_2)))
  )
names(stats_data) <- c("NAME_2","UNEMPR","YEAR") # str(stats_data)

# Add each year to map data
library("sqldf")
gadm_data <- sqldf("select gd.*, sd.UNEMPR as u2010 from gadm_data gd join stats_data sd using (NAME_2) where year = 2010")
gadm_data <- sqldf("select gd.*, sd.UNEMPR as u2011 from gadm_data gd join stats_data sd using (NAME_2) where year = 2011")
gadm_data <- sqldf("select gd.*, sd.UNEMPR as u2012 from gadm_data gd join stats_data sd using (NAME_2) where year = 2012")
gadm@data <- gadm_data

# Plot     
spplot(gadm,c("u2010","u2011","u2012"),at=c(0,10,20,30,40,50,70,100))

# Create sample Data, two factor variables
stats_data2 <-
  data.frame(
    as.character(rep(gadm_data$NAME_2,6)),
    as.numeric(round(runif(6*length(gadm_data$NAME_2), 0, 1),digits=3)*100),
    # Nested each= so that every region gets every YEAR x SEX combination exactly once
    as.factor(rep(rep(c(2010,2011,2012),each=length(gadm_data$NAME_2)),2)),
    as.factor(rep(c("f","m"),each=3*length(gadm_data$NAME_2)))
  )
names(stats_data2) <- c("NAME_2","UNEMPR","YEAR","SEX") # str(stats_data2)

I can do the ugly data manipulation using sqldf, but it gets more and more complicated as more factors are added. Suppose I have two factors with 2 and 10 levels: then I have to add 20 columns.
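
To illustrate the column explosion described above, here is a minimal sketch of generating all factor combinations in one step instead of one join per column. It assumes base R's reshape() and a combined panel key; the toy data and the names long2, PANEL, wide2 and panel_cols are hypothetical and only mimic the structure of stats_data2.

```r
# Toy long-format data with two grouping variables (made-up values)
long2 <- expand.grid(
  NAME_2 = c("A", "B", "C"),
  YEAR   = c(2010, 2011, 2012),
  SEX    = c("f", "m"),
  stringsAsFactors = FALSE
)
set.seed(1)
long2$UNEMPR <- round(runif(nrow(long2), 0, 100), 1)

# Collapse both factors into a single key, one value per future panel
long2$PANEL <- paste(long2$YEAR, long2$SEX, sep = "_")

# Widen once: one row per region, one column per YEAR x SEX combination
wide2 <- reshape(long2[, c("NAME_2", "PANEL", "UNEMPR")],
                 idvar = "NAME_2", timevar = "PANEL", direction = "wide")

# Column names to hand to spplot(), built without typing them by hand
panel_cols <- setdiff(names(wide2), "NAME_2")
print(panel_cols)
```

With the real map, the columns could then be attached by name, for example gadm@data <- cbind(gadm@data, wide2[match(gadm$NAME_2, wide2$NAME_2), panel_cols]), and plotted with spplot(gadm, panel_cols). The names.attr and layout arguments of spplot() should allow relabelling and arranging the panels, though this is untested against the GADM data used here.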

R version 2.15.1, Windows XP, SP3

Tomas Greif
  • You should have one row of data per polygon. It is hard to rewrite your code without a toy example. Always try to make questions [reproducible](http://stackoverflow.com/q/5963269/1478381) – Simon O'Hanlon Mar 07 '13 at 14:19
  • @SimonO101 I have added easily reproducible example to show how ugly code it is even for only one factor variable. – Tomas Greif Mar 07 '13 at 15:41
  • Great! I will have a look later when I get a chance to see if I can help you, and now you have a reproducible example I'm sure others will see if they can help too. – Simon O'Hanlon Mar 07 '13 at 15:56
