
I have polygons and I want to calculate the percentage overlap between them. When one polygon intersects another, the percentage can be calculated from the perspective of either polygon (i.e., to get the maximum value). I therefore want to write a script that computes the percent coverage between polygons from both perspectives and then puts all of the results in a data frame.

Here is the code that I have for the moment:

set.seed(131)
library(sf)
library(dplyr)   # for %>%, group_by(), mutate(), ...
library(tidyr)   # for pivot_longer() / pivot_wider()
library(mapview)

# five unit squares, randomly shifted, then dissolved into three polygons by id
m = rbind(c(0,0), c(1,0), c(1,1), c(0,1), c(0,0))
p = st_polygon(list(m))
n = 5
l = vector("list", n)
for (i in 1:n)
  l[[i]] = p + 2 * runif(2)
s = st_sfc(l)
s.f = st_sf(s)
s.f$id = c(1,1,2,2,3)
s.f.2 = s.f %>% group_by(id) %>% summarise(geometry = sf::st_union(s))

s.f.2$area = st_area(s.f.2)

# self-intersection: one row per overlapping piece; `origins` holds the row
# indices of the polygons contributing to each piece, `n.overlaps` their count
i = s.f.2 %>% 
  st_intersection(.) %>% 
  mutate(intersect_area = st_area(.))

i %>% 
  mutate(id1 = sapply(origins, function(x) as.character(s.f.2$id)[x][1]),
         id2 = sapply(origins, function(x) as.character(s.f.2$id)[x][2]),
         area.id1 = sapply(origins, function(x) s.f.2$area[x][1]),
         area.id2 = sapply(origins, function(x) s.f.2$area[x][2]),
         perc1 = as.vector(intersect_area / area.id1),
         perc2 = as.vector(intersect_area / area.id2)) %>% 
  filter(n.overlaps == 2) %>%        # keep only pairwise overlaps
  st_drop_geometry() %>%             # drop geometry as we don't need it
  select(intersect_area, id1, id2, perc1, perc2) %>%   # only the columns needed
  pivot_longer(cols = starts_with("perc"),
               names_to = "perc")

This code gives the percentage of overlap between the polygons (I'm handling only pairwise overlaps here, but it would be nice if this were generalizable to more than two overlapping polygons!); how st_intersection() reports higher-order overlaps is sketched below.
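st_intersection() on a single sf object already records, for each resulting piece, how many input polygons contribute to it and which ones, so the bookkeeping is not limited to pairs in principle; a minimal inspection, using the s.f.2 built above:

parts <- st_intersection(s.f.2)
parts$n.overlaps   # how many input polygons overlap in each resulting piece
parts$origins      # list column: row indices of the polygons contributing to each piece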

mapview(s.f.2, zcol = "id")


In the end, what I'm looking for is something like this:

id   `1`   `2`   `3`
1     100   31.6  0
2     27.0  100   0
3     0     0     100

So polygon "1" covers 31.6% of the area of polygon "2" and polygon "2" covers 27.0% of the area of polygon "1".
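For a single pair, the two percentages come from dividing the one intersection area by each polygon's own area; a minimal check, using the s.f.2 built above:

a  <- s.f.2[s.f.2$id == 1, ]
b  <- s.f.2[s.f.2$id == 2, ]
ab <- st_intersection(a, b)      # the region shared by polygons 1 and 2
100 * st_area(ab) / st_area(a)   # overlap as a percentage of polygon 1's area
100 * st_area(ab) / st_area(b)   # overlap as a percentage of polygon 2's area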

What I have at the moment (but it is very slow):

data.sp = s.f.2 %>%  
  st_as_sf() %>%
  mutate(area.m  = st_area(geometry),
         area.ha = units::set_units(area.m, ha)) %>%   # needs the units package installed
  select(-c(area, area.m))

id.sort = sort(unique(data.sp$id)) # used to reorder columns based on ID

df.fill = data.frame(id1 = NULL, id2 = NULL, area = NULL, over1 = NULL, over2 = NULL)

for (k in 1:length(id.sort)) {
  for (op in 1:length(id.sort)) {
    int.out = st_intersection(data.sp[data.sp$id==id.sort[k],], 
                              data.sp[data.sp$id==id.sort[op],])
    # int.out
    if(nrow(int.out) != 0) {
      area.tmp = st_area(int.out)#/10000
      over1 = area.tmp/int.out$area.ha
      over2 = area.tmp/int.out$area.ha.1
    } else {area.tmp = 0;over1=0;over2=0}
    
    df.fill.tmp = data.frame(id1 = id.sort[k], id2=id.sort[op], 
                             area = area.tmp,
                             over1 = over1*100,
                             over2 = over2*100)
    df.fill = rbind(df.fill,df.fill.tmp)
  }
}
df.fill$over1 = as.numeric(df.fill$over1)
df.fill$over2 = as.numeric(df.fill$over2)
df.fill %>% 
  select(-c(area, over2)) %>% 
  pivot_wider(names_from = id2,values_from = over1, 
              values_fill = 0)
– M. Beausoleil
  • Hi @M. Beausoleil. Your question is very interesting. I tried to suggest a solution (see answer below). I hope it will meet your needs. Cheers. – lovalery Nov 19 '21 at 00:53

4 Answers


Simple question but not an obvious answer! The solution I suggest follows a slightly different strategy from yours and does not involve any for loops. First, I developed a function (i.e. area_cover()) that produces a cross table with only those polygons that have at least one intersection. Then, in a second step, I developed another function (i.e. add_isolated_poly()) that appends the polygons with no intersection to the end of the cross table produced in the first step. This makes the final table easier to read when you have many polygons without intersections. So, please find the reprex below.

NB: The input data for the reprex corresponds to your sf object s.f.2, including its area column.
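The two functions compose like this (their full definitions follow below):

Results   <- area_cover(s.f.2)                   # step 1: cross table of the intersecting polygons
Results_2 <- add_isolated_poly(s.f.2, Results)   # step 2: append the isolated polygons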

Reprex

1. First step: create a cross table including only the polygons that have at least one intersection (excluding the polygons without intersections makes the cross table easier to read). To do this, I developed the function area_cover().

  • Code of the area_cover() function
library(sf)
library(dplyr)
library(tidyr)


area_cover <- function(x) {
  x %>% 
    st_intersection() %>% 
    filter(n.overlaps > 1) %>%
    mutate(area_inter = st_area(.)) %>%  
    unnest(., cols = c(origins, geometry)) %>% 
    left_join(., as.data.frame(x), by = c("origins" = "id")) %>% 
    mutate(cover_percent = area_inter/area.y*100) %>% 
    select(., origins, area.y, area_inter, cover_percent) %>% 
    rename("id" = "origins", "area" = "area.y") %>% 
    st_drop_geometry() %>%  
    group_by(area_inter) %>% 
    mutate(poly_X_id = rev(id)) %>% 
    relocate(poly_X_id, .before = area_inter) %>% 
    xtabs(cover_percent ~ id + poly_X_id, data = .) %>%  
    replace(. == 0, 100) %>% 
    round(., digits = 1)
}
  • Output of the area_cover() function
Results <- area_cover(s.f.2)

Results      # Only polygons with at least one intersection are present in this cross table
#>    poly_X_id
#> id      1     2
#>   1 100.0  31.6
#>   2  27.0 100.0  

class(Results) 
#> [1] "xtabs" "table"  # you can convert 'Results' into matrix with 'as.matrix()' if needed.

2. Second step (optional): add the isolated polygons (i.e. those with no intersection) to the end of the cross table Results produced in the previous step. To do this, I developed the function add_isolated_poly(), which creates a data frame with n columns corresponding to the n ids of the isolated polygons, filled with 0s.

  • Code of the add_isolated_poly() function
add_isolated_poly <- function(y, z){ # 'y' = s.f.2 and 'z' = result of area_cover()

  id_isolated_poly <- setdiff(y$id, colnames(z))

  df_isolated_poly <- y %>% 
    filter(., id %in% id_isolated_poly) %>% 
    st_drop_geometry() %>% 
    select(., id) %>% 
    t() %>% 
    as.data.frame() %>% 
    `colnames<-`(., id_isolated_poly) %>% 
    rbind(., rep(list(rep(0, nrow(y))), length(id_isolated_poly))) %>% 
    slice(-c(1))

  # column-bind matrices of unequal height, padding the shorter ones with 0s
  cbind.fill <- function(...){
    nm <- list(...)
    nm <- lapply(nm, as.matrix)
    n <- max(sapply(nm, nrow))
    do.call(cbind, lapply(nm, function(x)
      rbind(x, matrix(0, n - nrow(x), ncol(x)))))
  }

  z %>% 
    cbind.fill(., df_isolated_poly) %>%  
    replace(., col(.) == row(.), 100) %>%  
    `rownames<-`(., c(colnames(z), id_isolated_poly))
}
  • Output of the add_isolated_poly() function
Results_2 <- add_isolated_poly(s.f.2, Results)

Results_2
#>     1     2   3
#> 1 100  31.6   0
#> 2  27 100.0   0
#> 3   0   0.0 100

class(Results_2)
#> [1] "matrix" "array

Created on 2021-11-19 by the reprex package (v2.0.1)


IMPORTANT EDIT

Although they produce the right result with the minimal example above, the two functions I proposed are not generalizable and can produce wrong results. After a lot of trial and error, here is a simple, very fast function... and, this time, a correct one! So, please find the reprex below.

Solution

  • Code of the area_cover() function
library(sf)
library(dplyr)

area_cover <- function(x){
  Results <- x %>% 
    # intersect the layer with itself: one row per ordered pair of overlapping
    # polygons; columns coming from the second copy get a ".1" suffix
    st_intersection(., .) %>% 
    mutate(area_inter = st_area(.),
           # intersection area as a percentage of the ".1" polygon's area
           cover = area_inter/area.1*100) %>% 
    st_drop_geometry() %>% 
    # cross table id.1 x id (sparse = TRUE requires the Matrix package)
    xtabs(cover ~ id.1 + id, data = ., sparse = TRUE) %>%  
    round(., digits = 1) %>% 
    as.matrix(.) 
  
  names(dimnames(Results)) <- NULL
  
  return(Results)
}
  • Output of the area_cover() function
area_cover(s.f.2)

#>     1     2   3
#> 1 100  31.6   0
#> 2  27 100.0   0
#> 3   0   0.0 100

Created on 2021-11-22 by the reprex package (v2.0.1)

Benchmarking

I compared my function with the validated solution of @wibeasley, based on the new dataset provided by @M. Beausoleil (see the comments below).

For the comparison to be valid, I slightly modified @wibeasley's pipeline so that its output is a matrix containing percentages (i.e. the same output as my function).

  • Data
set.seed(1234)
m = rbind(c(0,0), c(1,0), c(1,1), c(0,1), c(0,0))
p = st_polygon(list(m))
n = 100
l = vector("list", n)
for (i in 1:n)
  l[[i]] = p + 7 * runif(2)
s = st_sfc(l)
s.f = st_sf(s)
s.f$id = c(1,1,2,2,3,4,5,5,6,7,7,8,9,8,7,3,4,5,5,6)
s.f.2 = s.f %>% group_by(id) %>%   summarise(geometry = sf::st_union(s)) # %>% summarise(area = st_area(s))
s.f.2$area = st_area(s.f.2)
  • Code to compare the two solutions
library(sf)
library(dplyr)
library(tidyr)
library(bench)

bench_results <- bench::mark(
  "wibeasley_validated_answer" = {  #!!!!!  NB: lighlty modified function for the comparison to be valid!!!!!
    wibeasley <- sf::st_intersection(s.f.2, s.f.2) %>% 
      dplyr::mutate(
        area       = sf::st_area(.),
        proportion = area / area.1 * 100
      ) %>%
      tibble::as_tibble() %>%
      dplyr::select(
        id_1 = id,
        id_2 = id.1,
        proportion
      ) %>% 
      # tidyr::complete(id_1, id_2, fill = list(proportion = 0))
      tidyr::pivot_wider(
        names_from = id_1,
        values_from = proportion,
        values_fill = 0
      ) %>% 
      as.matrix(.,rownames.force = TRUE) %>% 
      `<-`(., .[,-c(1)]) %>%  
      round(.,1)
    },  
  "lovalery_answer" = {
    lovalery <- area_cover(s.f.2)
  },
  min_iterations = 1000,
  relative = TRUE, 
  check = TRUE)
  • Results of the benchmarking (relative values)
bench_results
#> # A tibble: 2 x 6
#>   expression                   min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <dbl>  <dbl>     <dbl>     <dbl>    <dbl>
#> 1 wibeasley_validated_answer  1.13   1.10      1          1       1.16
#> 2 lovalery_answer             1      1         1.10      24.1     1
  • Results of the benchmarking (absolute values)
bench_results
#> # A tibble: 2 x 6
#>   expression                      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 wibeasley_validated_answer   46.4ms   49.1ms      20.2    2.54MB    1.11 
#> 2 lovalery_answer              41.5ms   43.7ms      22.7   61.09MB    0.969
  • Final checking that the two functions produce the same result
all.equal(wibeasley, lovalery)
#> [1] TRUE

wibeasley
#>       1     2     3     4     5     6     7     8     9
#> 1 100.0  19.0  13.3   4.6  22.6  21.2  12.8   3.7  11.6
#> 2  18.3 100.0  28.7  31.9  33.0  14.3  32.2  25.1   5.1
#> 3  14.6  32.7 100.0  23.3  35.5   7.2  28.8   7.0  13.7
#> 4   5.1  36.4  23.3 100.0  20.3  23.7  26.2  26.8  14.7
#> 5  13.0  19.8  18.7  10.7 100.0  10.3  27.0  15.6  12.4
#> 6  21.3  15.0   6.6  21.8  18.0 100.0  26.9  24.4  15.8
#> 7   9.5  24.8  19.5  17.7  34.7  19.9 100.0  11.8  17.9
#> 8   3.8  26.8   6.6  25.1  27.7  24.9  16.3 100.0   1.3
#> 9  22.1  10.0  23.7  25.5  41.0  30.0  46.0   2.3 100.0

lovalery
#>       1     2     3     4     5     6     7     8     9
#> 1 100.0  19.0  13.3   4.6  22.6  21.2  12.8   3.7  11.6
#> 2  18.3 100.0  28.7  31.9  33.0  14.3  32.2  25.1   5.1
#> 3  14.6  32.7 100.0  23.3  35.5   7.2  28.8   7.0  13.7
#> 4   5.1  36.4  23.3 100.0  20.3  23.7  26.2  26.8  14.7
#> 5  13.0  19.8  18.7  10.7 100.0  10.3  27.0  15.6  12.4
#> 6  21.3  15.0   6.6  21.8  18.0 100.0  26.9  24.4  15.8
#> 7   9.5  24.8  19.5  17.7  34.7  19.9 100.0  11.8  17.9
#> 8   3.8  26.8   6.6  25.1  27.7  24.9  16.3 100.0   1.3
#> 9  22.1  10.0  23.7  25.5  41.0  30.0  46.0   2.3 100.0

Created on 2021-11-22 by the reprex package (v2.0.1)

– lovalery
  • Should `cell(2,1)=27` be equal to `cell(1,2)=31.6`? If they can be different, then ignore my answer because I misunderstood the goal. I thought the output should be a symmetric matrix. – wibeasley Nov 19 '21 at 13:43
  • Hi @wibeasley, I am not the OP but, yes, I think you misunderstood the goal. The desired result is the one you describe in your comment: it corresponds to the area of the intersected zone divided by the total area of each polygon. Lastly, I think you should put your comment under the OP's question or under your answer, not mine ;-) Cheers. – lovalery Nov 19 '21 at 14:00
  • you're right. I missed the OP's "In the end, what I'm looking for is something like this" sentence in between the code blocks. I'll fix mine. Thanks for straightening me out. – wibeasley Nov 20 '21 at 01:04
  • Indeed, @lovalery. The idea behind this computation is that the area of overlap can mean something different depending on which polygon it is compared to. For example, suppose I have 2 animals with territories (polygons) that overlap. The area overlapping between the 2 territories represents a different percentage depending on the size of each animal's territory. So it is interesting to see how much is covered by one animal and by the other (maybe one completely overlaps the territory of the other, while the second only overlaps a fraction of the other animal's). Does that make sense? – M. Beausoleil Nov 20 '21 at 01:21
  • Not sure why, but the `unnest(., cols = c(origins, geometry))` gives me an `Error: Input must be list of vectors`. But the column origins is indeed a list... – M. Beausoleil Nov 20 '21 at 01:34
  • Ok, updated the tidyr package to 1.1.4 to make it work... but for the second part: `Error: 1 components of "..." were not used. We detected these problematic arguments: * "value" Did you misspecify an argument? Run "rlang::last_error()" to see where the error occurred.` – M. Beausoleil Nov 20 '21 at 01:56
  • @lovalery, I get the same error. Should `replace_na(., value = 0)` be changed to `replace_na(., replace = 0)`? – wibeasley Nov 20 '21 at 04:49
  • Hi Mr. Beausoleil and @wibeasley, thank you very much for your feedback. To tell the truth I had a hard time reproducing the error you get because on my PC everything was working fine. That said, I think I've found a fix to make it run smoothly in all circumstances! I think @wibeasley's hypothesis is right: the problem is with the `replace_na()` function which expects a vector or a dataframe and not a matrix, which is what the `cbind.fill()` function returns. – lovalery Nov 20 '21 at 14:52
  • So I edited my answer above by modifying the `cbind.fill()` function to then do without the `replace_na()` function. Please let me know if this now works for you as well. Cheers. (NB: the solution I suggest should run quickly (provided it works!)) – lovalery Nov 20 '21 at 14:52
  • Works for me now. – wibeasley Nov 20 '21 at 15:00
  • Hi @M. Beausoleil, I just realized that I had not tagged you in my last two or three comments. So this message is to fix that mistake! Therefore, please read my last comments just above and test my edited answer. It works now for wibeasley, so please let me know if it is the same for you (I keep my fingers crossed! ;-) ). Cheers. – lovalery Nov 20 '21 at 17:26
  • It does work! Thanks! So with the right package version and the edited answer, it runs super fast. But I think that @wibeasley's is faster. I ran `library(microbenchmark); test = microbenchmark(perc.all.poly(s.f.2), add_isolated_poly(s.f.2, area_cover(s.f.2))); autoplot(test)` with `set.seed(1234); p = st_polygon(list(m)); n = 100; l = vector("list", n); for (i in 1:n) l[[i]] = p + 7 * runif(2); s = st_sfc(l); s.f = st_sf(s); s.f$id = c(1,1,2,2,3,4,5,5,6,7,7,8,9,8,7,3,4,5,5,6); s.f.2 = s.f %>% group_by(id) %>% summarise(geometry = sf::st_union(s))`. – M. Beausoleil Nov 21 '21 at 14:29
  • This shows that @wibeasley's function is 5 times faster. – M. Beausoleil Nov 21 '21 at 14:29
  • Thank you very much for your feedback @M. Beausoleil. Glad to see that it works for you too. That said, I'm bowing low on the performance side ;-) Cheers. – lovalery Nov 21 '21 at 21:31
  • Hi @M. Beausoleil. While benchmarking on the new dataset, I realized that my solution was not giving the right results! In other words, my solution is not generalizable, mainly because the result of the function st_intersection(s.f.2) is particularly difficult to manipulate. So I rewrote the whole code starting from st_intersection(s.f.2, s.f.2), which produces a result that is much easier to handle. – lovalery Nov 22 '21 at 01:11
  • The result is a very fast function... even a little bit faster than the solution proposed by @wibeasley (that said, the difference is not very significant). So please find at the bottom of my answer a complete EDIT. I wanted to keep the original answer, certainly wrong, but which keeps our previous comment exchanges coherent for a new reader. Moreover, keeping my mistakes seems to be the more intellectually honest approach. Cheers. – lovalery Nov 22 '21 at 01:13

Without a real example to benchmark, I'm not sure it's faster than your solution. But it's simpler and easier to understand (at least for my brain).

sf::st_intersection() is vectorized, so it will find & return all the intersections of its first & second arguments for you. In this case, the two arguments are the same set of polygons.
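To see what that pairwise call returns before any summarising (using the question's s.f.2): each row is one intersection between a polygon of the first argument and a polygon of the second, and the second argument's columns come back with a .1 suffix (hence id.1 and area.1 below).

inter <- sf::st_intersection(s.f.2, s.f.2)
head(inter[, c("id", "id.1")])   # which pair of polygons each row comes from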

sf::st_intersection(s.f.2, s.f.2) %>% 
  dplyr::mutate(
    area       = sf::st_area(.),   # overwrites `area`: now the area of each intersection
    proportion = area / area.1     # ... as a fraction of the second polygon's area
  ) %>%
  tibble::as_tibble() %>%
  dplyr::select(
    id_1 = id,
    id_2 = id.1,
    proportion
  ) %>% 
  # tidyr::complete(id_1, id_2, fill = list(proportion = 0))
  tidyr::pivot_wider(
    names_from = id_1,
    values_from = proportion,
    values_fill = 0
  )

Output:

# A tibble: 3 x 4
   id_2   `1`   `2`   `3`
  <dbl> <dbl> <dbl> <dbl>
1     1 1     0.316     0
2     2 0.270 1         0
3     3 0     0         1

Things to consider:

  • Keep the areas as proportions, instead of percentages. It's usually better for calculations later.
  • Stay long, and don't pivot. It's usually better for calculations later, because you can join on id_1 and id_2.
  • If you pivot wide, you might want it as a matrix instead of a data.frame (see the sketch after this list).
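For instance, here is one way to go from the long table straight to a matrix-like cross table; the name long is mine and stands for the id_1/id_2/proportion tibble built above, and this is essentially what @lovalery's final function does with xtabs():

long <- sf::st_intersection(s.f.2, s.f.2) %>%
  dplyr::mutate(
    area       = sf::st_area(.),
    proportion = area / area.1
  ) %>%
  tibble::as_tibble() %>%
  dplyr::select(id_1 = id, id_2 = id.1, proportion)

# xtabs() builds the wide cross table and fills absent pairs with 0;
# the result can be indexed like a matrix, e.g. m["1", "2"]
m <- xtabs(proportion ~ id_2 + id_1, data = long)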
– wibeasley
  • I don't have benchmarks for this, but I can compare with what I had before and this is way faster! Thanks! Be careful if there are some units in the area column: for some reason pivot_wider can't use units... – M. Beausoleil Nov 20 '21 at 03:22
  • These work fine for me with the example data, but not with my own data (which is in the same format as the example data). I get the following error: `Error in s2_geography_from_wkb(x, oriented = oriented, check = check) : Evaluation error: Found 20 features with invalid spherical geometry. [1] Loop 0 is not valid: Edge 302 is degenerate (duplicate vertex).` My data is spatial in lat/long; perhaps this is the issue? – user303287 Jun 22 '22 at 18:33
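One common workaround for the invalid-spherical-geometry error above on lat/long data is to repair the geometries first, or to fall back to planar GEOS computations; a sketch, with my_polys standing in for the problematic layer (the name is hypothetical):

my_polys <- sf::st_make_valid(my_polys)   # repair duplicate-vertex / degenerate rings
# sf::sf_use_s2(FALSE)                    # alternative: disable s2 and compute planar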

There are a few things I'd optimize. I can't really evaluate the optimization with only three polygons, and I'm guessing the intersection calculation is the only really expensive part, so I'd start there.

You'll get an instant ~50% reduction if you don't calculate both (1) polygon A & B, and then (2) polygon B & A. In a sense, calculate only the upper triangle, and reuse/reflect the values to the lower triangle.


# I think you wanted to create 5 empty columns.  Use `numeric(0)` instead of `NULL`
df.fill = data.frame(id1 = numeric(0), id2 = numeric(0), area = numeric(0), over1 = numeric(0), over2 = numeric(0))

for (k in seq_along(id.sort)) {
  for (op in seq(from = k, to = length(id.sort), by = 1)) { # Avoid the lower triangle
    int.out = st_intersection(
      data.sp[data.sp$id==id.sort[k],], 
      data.sp[data.sp$id==id.sort[op],]
    )
    
    if(nrow(int.out) != 0) {
      area.tmp = st_area(int.out)#/10000
      over1 = area.tmp/int.out$area.ha
      over2 = area.tmp/int.out$area.ha.1 
    } else {area.tmp = 0;over1=0;over2=0}
    
    df.fill.tmp.upper = data.frame(id1 = id.sort[k], id2=id.sort[op], 
                                   area = area.tmp,
                                   over1 = over1,
                                   over2 = over2)
    df.fill.tmp.lower = data.frame(id1 = id.sort[op], id2=id.sort[k], 
                                   area = area.tmp,
                                   over1 = over2,
                                   over2 = over1)
    df.fill <- 
      if (k == op) rbind(df.fill, df.fill.tmp.upper)
      else         rbind(df.fill, df.fill.tmp.upper, df.fill.tmp.lower)
  }
}
df.fill %>% 
  dplyr::mutate(
    over1 = as.numeric(over1) * 100,
    over2 = as.numeric(over2) * 100
  ) %>%
  select(-area, -over2) %>% 
  pivot_wider(
    names_from = id2,
    values_from = over1, 
    values_fill = 0
  )

Output:

# A tibble: 3 x 4
    id1   `1`   `2`   `3`
  <dbl> <dbl> <dbl> <dbl>
1     1 100    31.6     0
2     2  27.0 100       0
3     3   0     0     100
– wibeasley
  • Nevermind about my comments on @lovalery's solution. Mine is producing the same asymmetric matrix. I was confusing things from last night. Thanks for your patience. The intersection only needs to be calculated once, and there are two division operations. Once by polygon A, and once by polygon B. – wibeasley Nov 20 '21 at 01:51

If the intersection calculation is really expensive compared to the rest of the code, this solution might be faster than my second one. Like @lovalery's solution, only one data.frame is passed to sf::st_intersection(). By my interpretation, the main difference is that this solution stays long and uses a cross join to enumerate all the possible combinations, pivoting wide only in the final step. It's easier (for me) to manipulate things when they are long instead of wide.

intersections <-
  s.f.2 %>% 
  sf::st_intersection() %>% 
  dplyr::filter(2L <= n.overlaps) %>%   # keep only the pieces where polygons overlap
  dplyr::mutate(
    numerator = sf::st_area(.)          # area of each overlapping piece
  ) %>%
  tibble::as_tibble() %>%
  tidyr::unnest_wider(origins) %>%      # spread origin row indices into `...1`, `...2`
  dplyr::select(
    id_1 = `...1`, 
    id_2 = `...2`, 
    numerator
  )

denominators <- 
  s.f.2 %>% 
  tibble::as_tibble() %>% 
  dplyr::select(id, area) 
  
denominators %>% 
  dplyr::full_join(denominators, by = character()) %>%   # cross join: every id paired with every id
  dplyr::select(
    id_1 = id.x,
    id_2 = id.y,
    area = area.y
  ) %>% 
  dplyr::left_join(intersections, by = c("id_1" = "id_1",  "id_2" = "id_2")) %>%
  dplyr::left_join(intersections, by = c("id_1" = "id_2",  "id_2" = "id_1")) %>% 
  dplyr::mutate(
    numerator   = dplyr::coalesce(numerator.x, numerator.y, 0),
    proportion  = dplyr::if_else(id_1 == id_2, 1, numerator / area),
  ) %>% 
  dplyr::select(id_1, id_2, proportion) %>% 
  tidyr::pivot_wider(
    names_from  = id_1,
    values_from = proportion,
    values_fill = 0
  )

Output:

# A tibble: 3 × 4
   id_2   `1`   `2`   `3`
  <dbl> <dbl> <dbl> <dbl>
1     1 1     0.316     0
2     2 0.270 1         0
3     3 0     0         1
– wibeasley