Rasterize polygons based on maximum overlap (using R packages terra or stars)

Question

I have a question concerning rasterization of polygons by maximum overlap, i.e assign the value of the polygon that has the highst area overlap with the raster cell.

The real world exercise is to rasterize polygons of soil-IDs in R, in order to produce relatively low resolution maps of soil properties as model inputs.

The problem is that the rasterize() function of the terra package (and similar stars' st_rasterize()) assigns the cell value from the polygon that contains the cell midpoint. If a raster cell contains multiple polygons, I would rather like to select the value of the polygon (soil-ID), which has the highest aerea cover in a raster cell.

Here is a small self-contained example that visualizes my problem, using terra.

library(terra)

f <- system.file("ex/lux.shp", package="terra")
v <- vect(f)
r <- rast(v, ncols = 3, nrow = 3)
rcc <- vect(xyFromCell(r, cell = 1:ncell(r)))

x <- rasterize(v, r, field = "NAME_2")
plot(x)
lines(r, col = "light gray")
lines(v)
points(rcc)

Mostly, the polygons that contain the cell center also seem to have the highest area share. However, in some cases (top row, 3rd cell), this is not the case. The problem appears to get worse the bigger the cells are compared with the polygons. I could therefore start with high resolution raster, and than resample to the desired (lower) resolution, using an aggregation function (e.g. the mode). But, maybe someone has a more efficient idea?

Thank you for your help!

Robert Hijmans · Answer 1 · 2022-02-13T07:20:20.040

You could do that like this:

#Example data
library(terra)    
f <- system.file("ex/lux.shp", package="terra")
v <- vect(f)
r <- rast(v, ncols = 3, nrow = 3)

n <- 10
r <- disagg(r, n)
r <- rasterize(v, r, "ID_2")
x <- aggregate(r, n, "modal")

plot(x)
lines(x)
lines(v, lwd=2)
text(v, col="red", halo=T)
text(x, col="blue", halo=T)

Another way, probably less efficient (especially if you have many IDs):

z <- lapply(1:nrow(v), \(i) rasterize(v[i,], r, cover=TRUE))
z <- which.max(rast(z))

But you could replace rasterize with exactextractr::coverage_fraction if you want very high precision

Even less efficient, I suppose:

r <- rast(v, ncols = 3, nrow = 3)
values(r) <- 1:ncell(r)
# get weights
e <- extract(r, v, weights=TRUE)
e <- as.matrix(e)
head(e)
#    ID lyr.1 weight
#[1,]  1     1   0.38
#[2,]  1     2   0.49
#[3,]  2     2   0.06
#[4,]  2     4   0.05
#[5,]  2     5   0.52
#[6,]  2     6   0.06

# find cell with max weight (you can use dplyr or data.table intead) 
x <- sapply(unique(e[,2]), function(i) { 
    d <- e[e[,2] == i, ,drop=FALSE]
    d[which.max(d[,3]), 2:1]
})

# remove values 
r <- rast(r)
# assign ID to cells
r[x[1,]] <- x[2,]

You could achieve the same with using polygon intersection, but that does not scale well to large rasters

r <- rast(v, ncols = 3, nrow = 3)
values(r) <- 1:9
v$ID <- 1:nrow(v)
i <- intersect(v[,"ID"], as.polygons(r))
i$area <- expanse(i)
i <- data.frame(i) 
x <- sapply(split(i, i[,2]), 
    \(x) { x[which.max(x[,3]), 2:1] |> unlist()}
)
r <- rast(r)
r[x[1,]] <- x[2,]

(perhaps not as elegant as st_join proposed by lovalery)

Thanks a lot, Robert! Your 1st solution puts my inital idea (rasterize in high resolution and aggregate) and probably is the better approach for larger target rasters. I will also try out your second idea. — paulsw, Feb 11 '22 at 08:33
Thanks a lot for sharing all the different possibilities using only `terra`. This is really very enlightening. Cheers. — lovalery, Feb 11 '22 at 11:50

lovalery · Accepted Answer · 2022-02-10T14:38:50.427

Please find one possible solution using terra and sf libraries.

The idea is to convert the SpatRaster r into a SpatVector and then into an sf object in order to take advantage of the sf::st_join() function using the largest = TRUE argument. The rest of the code then consists of simply converting the sf object back into a SpatVector and then a SpatRaster using the terra::rasterize() function.

So, please find below a reprex that details the procedure.

Reprex

Code

library(terra)
library(sf)

# Your data
f <- system.file("ex/lux.shp", package="terra")
v <- vect(f)
r <- rast(v, ncols = 3, nrow = 3)
rcc <- vect(xyFromCell(r, cell = 1:ncell(r)))

# Convert the 'SpatRaster' 'r' into a 'SpatVector (i.e. 'r_poly')
r_poly <- terra::as.polygons(r)

# Convert 'r_poly' into a 'sf' object (i.e. 'r_poly_sf')
r_poly_sf <- sf::st_as_sf(r_poly)

# Convert 'v' into a 'sf' object (i.e. 'v_sf')
v_sf <- sf::st_as_sf(v)

# Left join r_poly_sf with v_sf based on the largest overlap
results_sf <- sf::st_join(r_poly_sf, v_sf, largest = TRUE)

# Convert 'results_sf' into a SpatVector (i.e. 'results_vect')
results_vect <- terra::vect(results_sf)

# Rasterize 'results_vect' to get a 'SpatRaster' (i.e. 'results')  
results <- terra::rasterize(results_vect, r, field = "NAME_2")

Output

NB: please note that the cell of the upper right corner is NA because no polygon from r overlaps v (if needed you can still set the value for cells that do not overlap by using the background= argument inside the terra::rasterize() function).

results
#> class       : SpatRaster 
#> dimensions  : 3, 3, 1  (nrow, ncol, nlyr)
#> resolution  : 0.2613707, 0.2446047  (x, y)
#> extent      : 5.74414, 6.528252, 49.44781, 50.18162  (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326) 
#> source      : memory 
#> name        :   NAME_2 
#> min value   : Capellen 
#> max value   :   Remich

terra::values(results, dataframe=TRUE)
#>       NAME_2
#> 1   Clervaux
#> 2   Clervaux
#> 3       <NA>
#> 4    Redange
#> 5     Mersch
#> 6 Echternach
#> 7   Capellen
#> 8 Luxembourg
#> 9     Remich

Visualization

plot(results)
lines(r, col = "light gray")
lines(v)
points(rcc)

^{Created on 2022-02-10 by the reprex package (v2.0.1)}

Exactly the same as above but with rast(v, ncols = 5, nrow =5) to get a result that can be compared with the figure you gave in your question.

library(terra)
library(sf)

# Your data
f <- system.file("ex/lux.shp", package="terra")
v <- vect(f)
r <- rast(v, ncols = 5, nrow = 5)
rcc <- vect(xyFromCell(r, cell = 1:ncell(r)))

# Convert the 'SpatRaster' 'r' into a 'SpatVector (i.e. 'r_poly')
r_poly <- terra::as.polygons(r)

# Convert 'r_poly' into a 'sf' object (i.e. 'r_poly_sf')
r_poly_sf <- sf::st_as_sf(r_poly)

# Convert 'v' into a 'sf' object (i.e. 'v_sf')
v_sf <- sf::st_as_sf(v)

# Left join r_poly_sf with v_sf based on the largest overlap
results_sf <- sf::st_join(r_poly_sf, v_sf, largest = TRUE)
#> Warning: attribute variables are assumed to be spatially constant throughout all
#> geometries

# Convert 'results_sf' into a SpatVector (i.e. 'results_vect')
results_vect <- terra::vect(results_sf)

# Rasterize 'results_vect' to get a 'SpatRaster' (i.e. 'results')  
results <- terra::rasterize(results_vect, r, field = "NAME_2")
results
#> class       : SpatRaster 
#> dimensions  : 5, 5, 1  (nrow, ncol, nlyr)
#> resolution  : 0.1568224, 0.1467628  (x, y)
#> extent      : 5.74414, 6.528252, 49.44781, 50.18162  (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326) 
#> source      : memory 
#> name        :   NAME_2 
#> min value   : Capellen 
#> max value   :    Wiltz

terra::values(results, dataframe=TRUE)
#>              NAME_2
#> 1          Clervaux
#> 2          Clervaux
#> 3          Clervaux
#> 4              <NA>
#> 5              <NA>
#> 6             Wiltz
#> 7             Wiltz
#> 8           Vianden
#> 9           Vianden
#> 10             <NA>
#> 11          Redange
#> 12          Redange
#> 13           Mersch
#> 14       Echternach
#> 15       Echternach
#> 16         Capellen
#> 17         Capellen
#> 18       Luxembourg
#> 19     Grevenmacher
#> 20     Grevenmacher
#> 21 Esch-sur-Alzette
#> 22 Esch-sur-Alzette
#> 23 Esch-sur-Alzette
#> 24           Remich
#> 25           Remich

plot(results)
lines(r, col = "light gray")
lines(v)
points(rcc)

^{Created on 2022-02-10 by the reprex package (v2.0.1)}

This works perfectly and is a very elegant solution! Sorry for the mismatch between my code and figure, I played around with the resolution a bit and then copied the wrong figure. — paulsw, Feb 11 '22 at 08:23
@paulsw, thank you very much for your feedback. Glad that I could help. I wish you the best in your work. Cheers. — lovalery, Feb 11 '22 at 11:03

score 0 · Answer 3 · answered Jun 28 '22 at 15:10

Using exactextractr >= 0.9 (currently on GitHub), this can be done with

library(terra)
library(exactextractr)
library(sf)

v <- st_read(system.file("ex/lux.shp", package="terra"))
r <- rast(ext(v), ncols = 3, nrow = 3)
r <- rasterize_polygons(v, r)

plot(r)
plot(st_geometry(v), add=TRUE)

Rasterize polygons based on maximum overlap (using R packages terra or stars)

3 Answers3