Apply different sf functions to groups in a data frame

Question

I'm trying to calculate different Euclidean buffers (one 400m and one 800m) within a single simple features data frame using a set of piped dplyr function calls. The buffer distance should be specified for each feature based on the value of a grouping variable. I could easily split the data frame based on known values of the grouping variable, but I'd like to make the method as general as possible.

The following code works, but obviously only returns a single 400m buffer for all groups:

library(sf)
library(dplyr)

set.seed(42)
nc <- st_read(system.file("shape/nc.shp", package="sf"))
nc$grp <- sample(c(0,1), replace = TRUE, size = 100)

nc_buff <- nc %>%
  group_by(grp) %>%
  st_transform(32119) %>%
  group_map(~ st_buffer(.x, 400))

Ideally I'd split the data frame, calculate each buffer, and return a single simple features data frame with both sets of buffers combined.

How can I return a single data frame that contains a 400m buffer for grp == 0 and an 800m buffer for grp == 1?

This is helpful. If I try this: `nc_buff <- nc %>% group_by(grp) %>% st_transform(32119) %>% group_modify(~ st_buffer(.x, case_when(.y == 0 ~ 400, .y == 1 ~ 800))) ` It seems like the geometry information in the resultant sf data frame gets lost. Can you post the code you used? — captain picard, Dec 26 '19 at 18:37

score 3 · Accepted Answer · answered Dec 27 '19 at 14:06

You do not actually need to group the data and use group_map. You can in fact pass a vector of buffer widths directly to st_buffer:

library(sf)
library(dplyr)
set.seed(42)
nc <- st_read(system.file("shape/nc.shp", package="sf")) %>% 
    st_transform(32119) %>%
    st_centroid() %>% 
    mutate(grp = sample(c(0,1), 100, replace = TRUE))

Here, I “create” the buffer width column on the fly (using 4000 m and 8000 m rather than the 400 m and 800 m from the question):

nc_buff <- nc %>%
  st_buffer(., ifelse(.$grp == 0, 4000, 8000))

plot(nc_buff["NAME"])

For more complex cases or improved readability, you can also use mutate to create the buffer column beforehand, using for example ifelse or case_when:

nc_buff <- nc %>%
  mutate(buf_wdt = ifelse(grp == 0, 4000, 8000)) %>% 
  st_buffer(., .$buf_wdt) %>% 
  select(-buf_wdt)
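
The case_when variant mentioned above is not shown, so here is a minimal sketch of how it might look. It assumes the same setup as before (centroids of the nc counties in EPSG:32119, a 0/1 grp column) and reuses the buf_wdt column name; the distances are the same illustrative 4000 m and 8000 m values.

```r
library(sf)
library(dplyr)

set.seed(42)
# Same setup as above: project to EPSG:32119 (metres) and reduce to centroids
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE) %>%
  st_transform(32119) %>%
  st_centroid()
nc$grp <- sample(c(0, 1), nrow(nc), replace = TRUE)

# Build the per-feature buffer width with case_when instead of ifelse
nc <- nc %>%
  mutate(buf_wdt = case_when(
    grp == 0 ~ 4000,
    grp == 1 ~ 8000
  ))

# dist is taken element-wise, one width per feature
nc_buff <- st_buffer(nc, dist = nc$buf_wdt) %>%
  select(-buf_wdt)
```

With more than two groups, adding further `grp == ... ~ ...` lines to the case_when call should be enough.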

HTH!

Created on 2019-12-27 by the reprex package (v0.3.0)

This works brilliantly, thanks! Was not obvious to me that it was possible to pass a vector to `st_buffer()` but the help does say that dist defines the "buffer distance for all, or for each of the elements in x." — captain picard, Dec 27 '19 at 16:22
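
Since dist accepts one value per feature, the mapping from group to distance can also be kept in a lookup vector, which may be easier to extend than nested ifelse calls. This is only a sketch: dist_lookup and dists are hypothetical names, and the 400 m and 800 m values are the ones from the question.

```r
library(sf)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc <- st_transform(nc, 32119)  # projected CRS with units in metres

set.seed(42)
nc$grp <- sample(c(0, 1), nrow(nc), replace = TRUE)

# Hypothetical lookup: names are group values, entries are distances in metres
dist_lookup <- c("0" = 400, "1" = 800)

# One distance per feature, selected by the feature's group value
dists <- unname(dist_lookup[as.character(nc$grp)])

nc_buff <- st_buffer(nc, dist = dists)
```

A group value missing from dist_lookup would produce NA in dists, so it is worth checking `anyNA(dists)` before buffering.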

captain picard · Answer 2 · 2019-12-27T12:07:55.043

It looks like this does the trick. bind_rows() is pipeable but doesn't currently work correctly on simple features data frames.

library(sf)
library(dplyr)

set.seed(42)
nc <- st_read(system.file("shape/nc.shp", package="sf"))
nc$grp <- sample(c(0,1), replace = TRUE, size = 100)

nc_buff <- nc %>%
  group_by(grp) %>%
  st_transform(32119) %>%
  group_map(~ st_buffer(.x, case_when(.y$grp == 0 ~ 400, .y$grp == 1 ~ 800)))

# No way to bind_rows() with sf data frames yet so this extra step is needed
# https://github.com/r-spatial/sf/issues/798
nc_buff <- do.call(rbind, nc_buff)

Note: Similar reasoning for preferring do.call(rbind, ) over bind_rows() is given here: Convert a list of sf objects into one sf.
