Producing a 1 km grid of mean pond areas using coordinates of ponds and their pond areas

Question

I have the northing and easting coordinates of 17,306 ponds in Kent, England and the areas of almost all of the ponds in square metres. I am trying to create a 1 km grid which gives the mean pond area for each grid square while giving 0 values for grid squares with no ponds. I have been looking for questions similar to mine and have found one where a Thin Plate Spline algorithm is used to produce gridded rainfall data over the UK and then plot the surface onto this grid and write the data into a table (How to produce gridded output in R and eliminate grid squares that are not over land?).

I have been able to use this code on a small amount of my data to produce a similar result. Below is an example with a small amount of my data.

dput(head(KentPonds, 10))    

structure(list(Eastings = c(572745.0557, 578793.9616, 573157.8562, 
573664.2026, 572735.0952, 572738.741, 572742.0182, 572281.0791, 
572267.6893, 573673.0182), Northings = c(179326.0157, 179249.0268, 
179184.2076, 179173.6464, 179148.6766, 179123.1966, 179067.6473, 
179050.8956, 178994.7816, 178996.945), PondArea_sqm = c(448L, 
85L, 52L, 183L, 318L, 511L, 276L, 330L, 772L, 203L)), .Names = c("Eastings", 
"Northings", "PondArea_sqm"), row.names = c(NA, 10L), class = "data.frame")

#looks like this
Eastings Northings PondArea_sqm
572745.1  179326.0          448
578794.0  179249.0           85
573157.9  179184.2           52
573664.2  179173.6          183
572735.1  179148.7          318
572738.7  179123.2          511
572742.0  179067.6          276
572281.1  179050.9          330
572267.7  178994.8          772
573673.0  178996.9          203

library(fields)
library(maptools)
library(gstat)


names(KentPonds) <- c("Eastings", "Northings", "PondArea_sqm") 

fit <- Tps(cbind(KentPonds$Eastings,KentPonds$Northings),KentPonds$PondArea_sqm)
surface(fit)

xvals <- seq(500000, 650000, by=1000)
yvals <- seq(115000, 190000, by=1000)

griddf <- expand.grid(xvals, yvals)
griddf$pred <- predict(fit, x=as.matrix(griddf))
write.table(griddf, file="PArea1000_Grid(1).csv", sep=",", qmethod="double")

Using all 17,306 ponds it crashes my computer but more importantly I was hoping to be able to adapt it in some way so rather than predicted values produced by the tps I would get what I want i.e. the mean pond area for each grid square. I have been finding this very difficult so I would be very grateful if anyone could provide a solution or point me in the right direction.

Kind regards,

Aidan

To make this reproducible: run `dput` on `KentPonds` and paste the result at the head of your example (e.g., `KentPonds <- structure(list( ... ), class = "data.frame")`). Also, IMHO, your spatial methodology may produce fairly large error. A 1 km grid is probably too fine a resolution for aggregating pond area on the basis of centroid location alone. — Noah, Feb 19 '14 at 19:12
Interestingly in Kent at least at low pond densities where error may be high pond size is highly skewed towards smaller ponds. However you may be right and it may be necessary to increase the grid size but I am still left with my problem of inadequate R skills to create the grid. — user2358636, Feb 19 '14 at 22:16

score 1 · Answer 1 · answered Feb 20 '14 at 09:08

How about just creating some additional columns to indicate the grid reference, then use dplyr to create the aggregate measures. I've also shown it plotted. PS I shrunk your xvals and yvals range so that the items in the dataset provided would be more visible:

KentPonds<-read.table(header=T,text="Eastings Northings PondArea_sqm
572745.1  179326.0          448
578794.0  179249.0           85
573157.9  179184.2           52
573664.2  179173.6          183
572735.1  179148.7          318
572738.7  179123.2          511
572742.0  179067.6          276
572281.1  179050.9          330
572267.7  178994.8          772
573673.0  178996.9          203
")

xvals <- seq(570000, 575000, by=1000)
yvals <- seq(178000, 181000, by=1000)

KentPonds$x<-ceiling(KentPonds$Eastings/1000)*1000
KentPonds$y<-ceiling(KentPonds$Northings/1000)*1000

require(dplyr) # for aggregation
require(ggplot2) # for plotting

ponds_sqkm<-group_by(KentPonds,x,y) %.% 
  summarise(mean=mean(PondArea_sqm),total=sum(PondArea_sqm))

plotdata<-merge(ponds_sqkm,expand.grid(x=xvals,y=yvals),all.y=T)

ggplot(plotdata) + theme_bw() +
  geom_tile(aes(x,y,fill=mean)) +
  scale_fill_continuous(low="white",high="red",na.value="white")

enter image description here

Thanks Troy that worked perfectly using all my data and saved me a lot of time. — user2358636, Feb 20 '14 at 14:04

Producing a 1 km grid of mean pond areas using coordinates of ponds and their pond areas

1 Answers1