5

I would like to merge polygon data and raster data into one dataframe for purposes of then using randomForests package in R.
This involves first extracting the mean raster value per polygon.

So far, I have the following:

#load libraries
library(raster)
library(rgdal)
library(sp)
library(maptools)

#import raster data 
r <- raster("myRasterdata.tif")

#import polygon data 
p <- readShapePoly("myPolydata.shp")

#extract mean raster value for each polygon
ExtractMyData <- extract(r, p, small=TRUE, fun=mean, na.rm=TRUE, df=FALSE,  nl=1, sp=TRUE)
# note I have also tried this with df=TRUE and sp=FALSE

The output is a matrix, which I can write to a dataframe. But it does not have the spatial coordinates or the original polygon IDs, so I don't know how to join the output into the same database.I thought the sp=TRUE argument would do this, but it doesn't seem to work.

Note that I will actually have to convert the polygons to points (using a centroid method?) for purposes of RandomForests so I could guess what I really want is to join the mean raster values joined to points, not polygons.

Any suggestions would be greatly appreciated. Thank you!!

user3251223
  • 265
  • 1
  • 7
  • 15
  • It does look like the sp=TRUE option should do what you want. What is in p@data? Also, why do you have nl=2? – Jesse Anderson Mar 11 '14 at 18:44
  • oops, sorry nl should be 1 in this example. I was using nl=2 because I actually had a stack of rasters that I was working with but simplified it to 1 for purposes of asking my question. – user3251223 Mar 11 '14 at 18:45
  • class(p) tells me that it is a "SpatialPolygonsDataFrame". It has 12 variables (1 of which is the POLY_ID) and 6938 observations. So 6938 polygons that indicate presence or absence (1 or 0) of each of 8 species. – user3251223 Mar 11 '14 at 18:52
  • Sorry, I meant what is p@data after you run the extract...it looks like that's where the values should be if sp=T – Jesse Anderson Mar 11 '14 at 19:01
  • hmmm... I think I need to do writePolyShape(p,"p-output") but it doesn't seem to produce anything, so my p ends up looking the same as the input. – user3251223 Mar 11 '14 at 20:19
  • Oops sorry, when I used writePolyShape(p,"p-output") it did actually produce a shapefile, and there is an added column of "SP_ID" which is the same ID's as the extract raster values, so it looks like all I have to do now is join based on that. Perfect! – user3251223 Mar 11 '14 at 20:27

1 Answers1

7

This works:

library(raster)
library(sp)
library(maptools)


#import polygon data 
data(wrld_simpl)
p <- wrld_simpl

#create raster data 
r <- raster(extent(p))
r[] <- seq_len(ncell(r))


## this does it directly, adding columns "names(r)" to "p" 
p <- extract(brick(r, r * 2), p, fun = mean, na.rm = TRUE, sp = TRUE)

You can also do it more manually, see how extract with an aggregating function gives a single column vector:

p$ExtractData <- extract(r, p, fun = mean, na.rm = TRUE)

Or you could work on a multi-layer raster, column by column like this:

b <- brick(r, r * 2)
extr <- extract(b, p, fun = mean, na.rm = TRUE)
for (i in seq_len(ncol(extr))) p[[colnames(extr)[i]]] <- extr[,i]
mdsumner
  • 29,099
  • 6
  • 83
  • 91