I have a series of geographical positions at sea which I am trying to get geological sediment type information for. I am using an export of the national british geological sediment database (df1)which is a large data set of coordinates and sediment information. Currently I have been rounding the coordinates in the BGS export file (df1) and averaging/recalculating the sediment type for these coordinate squares, then I have rounded my coordinates in (df2) and matched these to these squares to get a sediment classification.
The BGS export looks like this (df1);
NUM X Y GRAV SAND MUD
1 228 1.93656 52.31307 1.07 98.83 0.10
2 142 1.84667 52.45333 0.00 52.60 47.40
3 182 1.91950 52.17750 9.48 90.38 0.14
4 124 1.88333 52.70833 0.00 98.80 1.20
5 2807 1.91050 51.45000 2.05 97.91 0.05
6 2787 1.74683 51.99382 41.32 52.08 6.60
7 2776 1.66117 51.63550 9.83 87.36 2.81
8 2763 1.82467 51.71767 43.92 47.25 8.83
9 2753 1.76867 51.96349 57.66 39.18 3.15
10 68 2.86967 52.96333 0.30 98.90 0.80
11 2912 1.70083 51.77783 26.90 64.87 8.22
12 2914 1.59750 51.88882 32.00 65.02 2.97
13 2886 1.98833 51.34267 1.05 98.91 0.04
14 2891 1.87817 51.31549 68.57 31.34 0.08
15 2898 1.37433 51.41249 35.93 61.48 2.59
16 45 2.06667 51.82500 9.70 88.10 2.20
17 2904 1.63617 51.45999 16.28 66.67 17.05
My positions at sea look like this (df2);
haul DecStartLat DecStartLong
1993H_2 55.23983 -5.512830
2794H_1 55.26670 -5.516700
1993H_1 55.27183 -5.521330
0709A_71 55.26569 -5.519730
0396H_2 55.44120 -5.917800
0299H_2 55.44015 -5.917310
0514A_26 55.46897 -5.912167
0411A_64 55.47289 -5.911820
0410A_65 55.46869 -5.911930
0514A_24 55.63585 -5.783500
0295H_4 55.57250 -5.754300
0410A_62 55.63656 -6.041870
0413A_53 55.73280 -6.020600
0396H_13 55.66470 -6.002300
2794H_8 55.83330 -5.883300
0612A_15 55.84025 -5.912130
0410A_74 55.84311 -5.910180
0299H_16 55.90568 -5.732490
0200H_18 55.88600 -5.742900
0612A_18 55.90450 -5.835880
This is my script...
get.Sed.type <- function(x,y) {
x$Y2 <- round(x$Y, digits=1)
x$X2 <- round(x$X, digits=1)
x$BGSQ <- paste(x$Y2,x$X2,sep="_")
x$RATIO <- x$SAND/x$MUD
x <- aggregate(cbind(GRAV,RATIO)~BGSQ,data=x,FUN=mean)
FOLK <- (x$GRAV)
FOLK[(FOLK)<1] <- 0
FOLK[(FOLK)>=1&(FOLK)<5] <- 1
FOLK[(FOLK)>=5&(FOLK)<30] <- 5
FOLK[(FOLK)>=30&(FOLK)<80] <- 30
FOLK[(FOLK)>=80] <- 80
R_CLASS <- (x$RATIO)
R_CLASS[(R_CLASS)<1/9] <- 0
R_CLASS[(R_CLASS)>=1/9&(R_CLASS)<1] <- 0.1
R_CLASS[(R_CLASS)>=1&(R_CLASS)<9] <- 1
R_CLASS[(R_CLASS)>=9] <- 9
x$FOLK_CLASS <- NULL
x$FOLK_CLASS[(R_CLASS)==0&(FOLK)==0] <- "M"
x$FOLK_CLASS[(R_CLASS)%in%c(0,0.1)&(FOLK)==5] <- "gM"
x$FOLK_CLASS[(R_CLASS)==0.1&(FOLK)==0] <- "sM"
x$FOLK_CLASS[(R_CLASS)==0&(FOLK)==1] <- "(g)M"
x$FOLK_CLASS[(R_CLASS)==0.1&(FOLK)==1] <- "(g)sM"
x$FOLK_CLASS[(R_CLASS)==9&(FOLK)==0] <- "S"
x$FOLK_CLASS[(R_CLASS)==1&(FOLK)==0] <- "mS"
x$FOLK_CLASS[(R_CLASS)==9&(FOLK)==1] <- "(g)S"
x$FOLK_CLASS[(R_CLASS)==1&(FOLK)==1] <- "(g)sM"
x$FOLK_CLASS[(R_CLASS)==1&(FOLK)==5] <- "gmS"
x$FOLK_CLASS[(R_CLASS)==9&(FOLK)==5] <- "gS"
x$FOLK_CLASS[(FOLK)==80] <- "G"
x$FOLK_CLASS[(R_CLASS)%in%c(0,0.1)&(FOLK)==30] <- "mG"
x$FOLK_CLASS[(R_CLASS)==1&(FOLK)==30] <- "msG"
x$FOLK_CLASS[(R_CLASS)==9&(FOLK)==30] <- "sG"
y$Lat <- round(y$DecStartLat, digits=1)
y$Long <- round(y$DecStartLong, digits=1)
y$LATLONG100_sq <- paste(y$Lat,y$Long,sep="_")
y <- merge(y, x[,c(1,4)],all.x=TRUE,by.x="LATLONG100_sq",by.y="BGSQ")
#Delete unwanted columns
y <- y[, !(colnames(y) %in% c("Lat","Long","LATLONG100_sq"))]
#Name column something logical
colnames(y)[colnames(y) == 'FOLK_CLASS'] <- 'BGS_class'
return(y)
}
However I have a dozen or so positions in db2 for which there are no corresponding values in the BGS export (db1), I want to know how I can either ask it to do another average for the squares surrounding that respective square (i.e. round to a larger number and repeat the process) OR to ask it to find the coordinate in the BGS export file that is closest in proximity and take the existing value.