How to use purrr::map with dplyr::mutate and across in R

Question

I've looked at a couple of previous examples with mutate, across, and map but struggled to fully understand them. Apologies if this question is a duplicate. Here are the other two posts that may be relevant - Using mutate(across(...)) with purrr::map and purrr::pmap with dplyr::mutate.

Background:

I have a list of ten dataframes. All of them have a similar number of columns with similar names. (Some may have one or two more.) My goal is to combine them all into one dataframe, and I plan to use bind_rows() or list_rbind().

Problem:

Because of the poor quality of the raw CSV data files, the same column in different files may be of a different class. As such, running bind_rows() returns this error.

Error in `bind_rows()`:
! Can't combine `..1$cfv` <character> and `..2$cfv` <double>.
Backtrace:
 1. data_list %>% bind_rows()
 2. dplyr::bind_rows(.)

Attempted solution:

Because I don't know for sure the class of each column, and some dataframes may be missing a column, my plan is to first convert all columns to the character class, bind the dataframes together, and then convert the relevant columns back to numeric.

To convert all columns of all dataframes in the list to the character class, I thought to use mutate, across, and map.

This is the code.

data = data_list %>% 
  map(mutate(across(everything(), ~ as.character(.))))

However, it does not work and returns this error message.

Error in `across()`:
! Must only be used inside data-masking verbs like `mutate()`, `filter()`, and `group_by()`.
Backtrace:
 1. data_list %>% ...
 6. dplyr::across(everything(), ~as.character(.))

Question:

How do I use mutate(), across(), and map() together? Alternatively, better ways to combine the different dataframes in the list would be welcome, too.

Thanks in advance.

score 3 · Accepted Answer · answered May 05 '23 at 18:38

This is what you want. Remember that the .f argument of map is a function that will be applied to each element. That function should accept an argument, i.e. the . in this line of code, which represents the data frame.

data_list %>%
  map(~mutate(., across(everything(), as.character)))

In your attempt (which was close!), you passed map a call to mutate() rather than a function, so there is no argument to receive each data frame.
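To make the "function that accepts an argument" point concrete, here is a minimal sketch of three ways to hand map() such a function. The list `toy_list` and its columns are made up for illustration; they are not the asker's data.

```r
library(dplyr)
library(purrr)

# Hypothetical toy list: the same column has a different class in each frame
toy_list <- list(
  tibble(id = 1:2, cfv = c("a", "b")),
  tibble(id = 3:4, cfv = c(1.5, 2.5))
)

# 1. purrr-style formula lambda: `.` is the current list element
out_formula <- map(toy_list, ~ mutate(., across(everything(), as.character)))

# 2. anonymous function (the `\(df)` shorthand needs R >= 4.1)
out_lambda <- map(toy_list, \(df) mutate(df, across(everything(), as.character)))

# 3. pass mutate itself; map() forwards the extra argument through `...`
out_dots <- map(toy_list, mutate, across(everything(), as.character))
```

In all three cases map() receives something it can call once per data frame, which is what the original attempt was missing.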

Here's a reprex.

library(tidyverse)

dat <- as_tibble(mtcars)

# what you want to do on one data frame
dat %>%
  mutate(across(everything(), as.character))
#> # A tibble: 32 × 11
#>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
#>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
#>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
#>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
#>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
#>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
#>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
#>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
#>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
#>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
#> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
#> # ℹ 22 more rows

data_list <- list(dat, dat, dat)

# applied to a list
data_list %>%
  map(~mutate(., across(everything(), as.character)))
#> [[1]]
#> # A tibble: 32 × 11
#>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
#>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
#>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
#>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
#>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
#>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
#>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
#>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
#>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
#>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
#> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
#> # ℹ 22 more rows
#> 
#> [[2]]
#> # A tibble: 32 × 11
#>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
#>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
#>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
#>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
#>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
#>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
#>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
#>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
#>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
#>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
#> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
#> # ℹ 22 more rows
#> 
#> [[3]]
#> # A tibble: 32 × 11
#>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
#>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
#>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
#>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
#>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
#>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
#>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
#>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
#>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
#>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
#> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
#> # ℹ 22 more rows

# depending on your taste, you might like to use the ... arguments instead
data_list %>%
  map(mutate, across(everything(), as.character))
#> [[1]]
#> # A tibble: 32 × 11
#>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
#>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
#>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
#>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
#>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
#>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
#>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
#>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
#>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
#>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
#> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
#> # ℹ 22 more rows
#> 
#> [[2]]
#> # A tibble: 32 × 11
#>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
#>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
#>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
#>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
#>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
#>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
#>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
#>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
#>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
#>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
#> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
#> # ℹ 22 more rows
#> 
#> [[3]]
#> # A tibble: 32 × 11
#>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
#>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
#>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
#>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
#>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
#>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
#>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
#>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
#>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
#>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
#> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
#> # ℹ 22 more rows

Created on 2023-05-05 with reprex v2.0.2
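For completeness, a rough sketch of the whole plan from the question (convert to character, bind, convert back). The list `raw_list` and the choice of `id` and `cfv` as the columns to turn back into numeric are assumptions for the example; swap in your own column names.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for the list read from the CSV files
raw_list <- list(
  tibble(id = 1:2, cfv = c("1.5", "2")),  # cfv read as character
  tibble(id = 3:4, cfv = c(2.5, 3.5)),    # cfv read as double
  tibble(id = 5:6)                        # cfv missing entirely
)

combined <- raw_list %>%
  # every column of every frame becomes character, so classes agree
  map(~ mutate(., across(everything(), as.character))) %>%
  # frames that lack a column get NA in that column
  bind_rows() %>%
  # convert the columns known to be numeric back again
  mutate(across(c(id, cfv), as.numeric))
```

Note that as.numeric() turns any value it cannot parse into NA with a warning, so it is worth checking for unexpected NAs after the last step.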

Great, thank you, Arthur! May I ask further why there is the `.,` in `map(~mutate(., across(everything(), as.character)))`? I understand the `.` refers to the dataframe that's passed along. But, my confusion lies in that when I use `mutate` with `across` without `map`, I don't need to include the `.`, whereas in this scenario `.` is needed. Edit: To clarify, it would make sense to me if the `.` were outside `mutate`. E.g., `map(., ~mutate(., across(everything(), as.character)))` instead of `map(~mutate(., across(everything(), as.character)))`. I'm confused about why this is the case. — Tee, May 05 '23 at 22:26
It's because the pipe passes the dataframe as the first argument to mutate in this code: `df %>% mutate()`. But in this code, `data_list %>% map(~mutate())`, nothing is being piped into mutate. — Arthur, May 08 '23 at 17:49
I see. Thank you. One more follow-up question. Why is it that the `.,` is not needed when using `mutate()` and `across()` together without `map()`, but is necessary when using `map()`? I.e., `mutate(across(everything(), as.character))` works but not `map(mutate(across(everything(), ~ as.character(.))))`? How does `mutate(across(everything(), as.character))` know what is the input without a `.`? — Tee, May 16 '23 at 17:54
`mutate(across(everything(), as.character))` is equivalent to `mutate(across(everything(), ~as.character(.)))`. They both will work. If you look at the `across` documentation, you'll see that the second argument `.fns` can accept a function itself `as.character`, a lambda function `~as.character(.)` or a list of functions. Why `map(mutate(across(everything() ...` doesn't work is for a different reason. For that reason, see my previous comment. The first argument to `mutate` needs to be a dataframe and there is no dataframe supplied to `mutate` in this code: `map(mutate(across(....`. — Arthur, May 17 '23 at 16:27
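
A small sketch of the three `.fns` forms mentioned in the last comment, using a made-up tibble `df` (not from the question). The list form is included only to show how its result differs.

```r
library(dplyr)

# Hypothetical two-column tibble
df <- tibble(x = c(1.5, 2), y = c(3L, 4L))

# bare function
by_function <- mutate(df, across(everything(), as.character))

# formula lambda; `.` is the current column
by_lambda <- mutate(df, across(everything(), ~ as.character(.)))

# named list of functions: results go into new columns named {col}_{fn},
# and the original columns are kept
by_list <- mutate(df, across(everything(), list(chr = as.character)))
```

The first two give the same result; the list form adds `x_chr` and `y_chr` alongside the untouched `x` and `y`.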
