For conventional scenarios, I prefer for readr to loop through the csvs by itself. But there some scenarios where it helps to process files individually before stacking them together.
A few weeks ago, purrr 1.0's map_dfr()
function was "superseded in favour of using the
appropriate map function along with list_rbind()".
@zephryl's snippet is slightly modified to become
list.files() |>
rlang::set_names() |>
purrr::map(readr::read_delim) |>
# { possibly process files here before stacking/binding } |>
purrr::list_rbind(names_to = "file")
The functions were superseded in purrr 1.0.0 because their names suggest they work like _lgl()
, _int()
, etc which require length 1 outputs, but actually they return results of any size because the results are combined without any size checks. Additionally, they use dplyr::bind_rows()
and dplyr::bind_cols()
which require dplyr to be installed and have confusing semantics with edge cases. Superseded functions will not go away, but will only receive critical bug fixes.
Instead, we recommend using map()
, map2()
, etc with list_rbind()
and list_cbind()
. These use vctrs::vec_rbind()
and vctrs::vec_cbind()
under the hood, and have names that more clearly reflect their semantics.
Source: https://purrr.tidyverse.org/reference/map_dfr.html