For each municipality i, I need to assign a price taken from a list of municipalities. Each municipality in the list has a price value if it produced the commodity in a given year, or NA if it did not. So municipality i can either (a) keep its own price or (b) take the price of its closest neighbor with a valid value, and that neighbor may change from year to year (2000 through 2014).
The data are in a shapefile imported into R with SP <- readOGR("MT.shp"),
where N1 is the ID number, N2 is the municipality code, N3 is its name, and N4 through N18 are the prices (when available):
$ N1 : Factor w/ 287 levels "1","10","100",..: 287 1 112 210 221 232 243 254 265 276 ...
$ N2 : Factor w/ 287 levels "1100015","1100023",..: 287 53 54 55 56 57 58 59 60 61 ...
$ N3 : Factor w/ 287 levels "AGUA AZUL DO NORTE",..: 148 2 3 4 5 1 6 7 10 16 ...
$ N4 : Factor w/ 39 levels "0.1999605989",..: 39 NA NA NA NA NA NA NA 32 NA ...
$ N5 : Factor w/ 49 levels "0.2099991931",..: 49 NA NA NA NA NA NA NA 42 NA ...
$ N17: Factor w/ 103 levels "0.5011494253",..: 103 NA NA NA NA NA NA NA 55 NA ...
$ N18: Factor w/ 100 levels "0.75","0.8092253594",..: 100 NA NA NA NA NA NA NA 93 NA ...
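Since the price columns were read in as factors, I assume they have to be converted to numeric before any comparison or filling. Below is a minimal sketch on a made-up data frame (the values and the `price_cols` vector are only for illustration, not my real data). It converts through `as.character()`, because calling `as.numeric()` directly on a factor returns the level codes rather than the prices.

```r
# Toy data frame mimicking the structure above; values are invented.
df <- data.frame(
  N3 = c("A", "B", "C"),
  N4 = factor(c("0.75", NA, "0.1999605989")),
  N5 = factor(c(NA, "0.21", "0.8"))
)

# Hypothetical vector naming the price columns (N4 through N18 in the real data).
price_cols <- c("N4", "N5")

# Convert factor -> character -> numeric so the actual prices are kept,
# not the internal integer level codes.
df[price_cols] <- lapply(df[price_cols], function(x) as.numeric(as.character(x)))

str(df)
```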
Therefore, I need to fill all NAs (2965 in total) with the value from the closest municipality that has a valid number. I read similar guidance from:
but it did not apply directly to my dataset.
Is there a function or formula to do this task?
Thank you.
Following @StatnMap's help, I calculated the distances between all pairs of municipalities:
> distmin <- spDists(SP, longlat = TRUE); class(distmin)
[1] "matrix"
I also have a matrix called psoy with dimensions 287 x 15 (one column per year). An example:
X2000 X2001 X2002
[1,] NA NA NA
[2,] 0.2500000 0.2866667 0.3600000
[3,] NA NA NA
[4,] 0.6500390 0.5598765 0.4000000
[5,] 0.7500390 0.5598765 NA
[6,] 0.6000000 NA 0.3500000
[7,] 0.4200150 0.4700000 0.3000000
[8,] NA NA 0.4000000
For each column of psoy, say X2000, I need to find the nearest municipality with a valid (non-NA) value. I created a new data frame A to try to handle this problem year by year:
> A <- data.frame(psoy$X2000, distmin)
For municipality X1, for instance, A would be (the first column holds the prices and the second the distances from X1 to each municipality):
X2000 X1
[1,] NA 0.00000
[2,] 0.2500000 360.45196
[3,] NA 62.37937
[4,] 0.6500390 261.60197
[5,] 0.7500390 583.45032
[6,] 0.6000000 696.34611
[7,] 0.4200150 600.53779
[8,] NA 764.88325
So, in this example, X1 does not have a value and should get the value from the nearest municipality that has one. The answer I need is 0.6500390 from [4,], replacing the NA in row [1,], column X2000.
I tried several commands to do this automatically, but none worked. The best I managed was:
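For this single municipality, the lookup I am describing can be sketched as follows. The two vectors are copied from the example above; the names `price`, `dist_x1`, `have` and `nearest` are only illustrative. The idea is to restrict the search to rows that have a price before taking the minimum distance.

```r
# Values copied from the example above (X2000 prices and distances from X1).
price   <- c(NA, 0.2500000, NA, 0.6500390, 0.7500390, 0.6000000, 0.4200150, NA)
dist_x1 <- c(0, 360.45196, 62.37937, 261.60197, 583.45032, 696.34611, 600.53779, 764.88325)

have    <- which(!is.na(price))               # rows with a valid price
nearest <- have[which.min(dist_x1[have])]     # closest of those rows to X1

nearest          # 4
price[nearest]   # 0.650039
```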
A$psoy00r <- ifelse(!is.na(A$psoy.X2000), A$psoy.X2000,
with(A, psoy.X2000[which.min(X1)], na.rm = TRUE))
but it still kept the NAs where they already existed.
I appreciate any help, and thank you for your attention.
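My guess is that which.min(X1) always picks the municipality itself (distance 0), whose price is the NA I am trying to replace. One direction that might work is to search only among the rows that have a price in that year, and repeat this for every column. The sketch below does that on made-up data: `fill_nearest` is a hypothetical helper name, the positions and prices are invented, and the distance matrix is built with `dist()` only to stand in for the one returned by `spDists()`.

```r
# Hypothetical helper: for each column of `prices`, replace every NA with the
# value from the nearest row (according to distance matrix `d`) that has a
# non-NA value in that same column.
fill_nearest <- function(prices, d) {
  prices <- as.matrix(prices)
  filled <- prices
  for (j in seq_len(ncol(prices))) {
    have <- which(!is.na(prices[, j]))   # rows with a price in this year
    miss <- which(is.na(prices[, j]))    # rows that need filling
    if (length(have) == 0 || length(miss) == 0) next
    for (i in miss) {
      nearest <- have[which.min(d[i, have])]
      filled[i, j] <- prices[nearest, j]
    }
  }
  filled
}

# Toy data (invented): five municipalities placed on a line.
pos <- c(0, 1, 3, 7, 8)
d   <- as.matrix(dist(pos))              # stand-in for the spDists() matrix
toy <- cbind(X2000 = c(NA, 0.25, NA, 0.65, 0.75),
             X2001 = c(NA, NA, 0.40, NA, 0.56))

filled <- fill_nearest(toy, d)
filled
```

Values are always read from the original matrix, so a filled NA is never used as the source for another municipality.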