
I want to apply multiple functions to the same dataframe. However, I have not been able to successfully pass column names as a parameter in purrr::imap. I keep getting the following error:

Error in UseMethod("select") : no applicable method for 'select' applied to an object of class "character"

I have tried many combinations for evaluation (e.g., using !!!, [[, enquo, sys.lang, and so on). When I apply a function (e.g., check_1) directly to a dataframe, select works fine. However, it does not work when I try to pass column names as a parameter using imap and exec. The format of the column names is part of the issue (e.g., 1.1.), but I have tried double quotes, single quotes, etc.

This is a follow-up to a previous post, but that post and its solution focused on applying multiple functions to individual columns. Now I need to apply multiple functions that each use more than one column in the dataframe; hence the need to specify column names in a function.

Minimal Example

Data

df <- structure(
  list(
    `1.1.` = c("Andrew", "Max", "Sylvia", NA, "1",
               NA, NA, "Jason"),
    `1.2.` = c(1, 2, 2, NA, 4, 5, 3, NA),
    `1.2.1.` = c(
      "cool", "amazing", "wonderful", "okay",
      NA, NA, "chocolate", "fine"
    )
  ),
  class = "data.frame",
  row.names = c(NA, -8L)
)

What I have Tried

library(purrr)
library(dplyr)

check_1 <- function(x, col1, col2) {
  x %>%
    dplyr::select(col1, col2) %>%
    dplyr::mutate(row.index = row_number()) %>%
    dplyr::filter(col1 == "Jason" & is.na(col2) == TRUE) %>%
    dplyr::select(row.index) %>%
    unlist() %>%
    as.vector()
}

check_2 <- function(x, col1, col2) {
  index <- x %>%
    dplyr::select(col1, col2) %>%
    dplyr::mutate(row.index = row_number()) %>%
    dplyr::filter(col1 >= 3 & col1 <= 5 & is.na(col2) == TRUE) %>%
    dplyr::select(row.index) %>%
    unlist() %>%
    as.vector()
  return(index)
}

checks <-
  list("df" = list(fn = check_1, pars = list(col1 = "1.1.", col2 = "1.2.")),
       "df" = list(fn = check_2, pars = list(col1 = "1.2.", col2 = "1.2.1.")))

results <-
  purrr::imap(checks, ~ exec(.x$fn, x = .y,!!!.x$pars))

Expected Output

> results
$df
[1] 8

$df
[1] 5 6

Besides the "class character" error, I have a second problem when I test the check_2 function on its own: it returns an empty result instead of the expected values.

[1] 1.2.      1.2.1.    row.index
<0 rows> (or 0-length row.names)

I have looked at many other similar SO posts (e.g., this one), but none have solved this issue for me.

AndrewGB

2 Answers

4

The first issue is that you pass the name of the dataframe but not the dataframe itself: in imap, .y is the name of each list element, here the string "df". That's why you get the first error, as you are trying to select from a character string. To solve this issue, add the dataframe to the list you are looping over.
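As a minimal sketch of what `imap` hands to the formula (using a hypothetical toy list named `toy`, not the original `checks`), the second argument `.y` is the element's name as a character string, not the object that happens to share that name:

```r
library(purrr)

# Hypothetical toy input: a single element named "df"
toy <- list(df = 1:3)

# .x is the element's value, .y is the element's name
seen <- purrr::imap(
  toy,
  ~ list(value_class = class(.x), name_class = class(.y), name = .y)
)

seen$df$name_class  # "character"
seen$df$name        # "df"
```

Calling `select()` on `.y` therefore dispatches on a character vector, which matches the `UseMethod("select")` error in the question.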

The second issue is that when you pass the column names as character strings, you have to tell dplyr that these strings refer to columns in your data. This can be achieved, e.g., by using the .data pronoun (.data[[col1]]) inside filter and all_of() inside select.
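The empty result reported in the question can be reproduced on a small scale. The sketch below assumes a hypothetical stand-in data frame `toy` with one non-syntactic column name. It contrasts comparing the string held in `col1` with looking up the column that `col1` names:

```r
library(dplyr)

# Hypothetical stand-in data with a non-syntactic column name
toy <- data.frame(`1.1.` = c("Max", "Jason", NA), check.names = FALSE)
col1 <- "1.1."

# Compares the string "1.1." itself with "Jason": FALSE for every row
no_rows <- toy %>% dplyr::filter(col1 == "Jason")

# Looks up the column named by col1, then compares its values
one_row <- toy %>% dplyr::filter(.data[[col1]] == "Jason")

nrow(no_rows)  # 0
nrow(one_row)  # 1
```

The first call raises no error, which is why the problem shows up only as a zero-row result.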

Finally, instead of select + unlist + as.vector you could simply use dplyr::pull:

library(purrr)
library(dplyr)

check_1 <- function(x, col1, col2) {
  x %>%
    dplyr::select(all_of(c(col1, col2))) %>%
    dplyr::mutate(row.index = row_number()) %>%
    dplyr::filter(.data[[col1]] == "Jason" & is.na(.data[[col2]]) == TRUE) %>%
    dplyr::pull(row.index)
}

check_2 <- function(x, col1, col2) {
  x %>%
    dplyr::select(all_of(c(col1, col2))) %>% 
    dplyr::mutate(row.index = row_number()) %>%
    dplyr::filter(.data[[col1]] >= 3 & .data[[col1]] <= 5 & is.na(.data[[col2]]) == TRUE) %>%
    dplyr::pull(row.index)
}

checks <-
  list(df = list(df = df, fn = check_1, pars = list(col1 = "1.1.", col2 = "1.2.")),
       df = list(df = df, fn = check_2, pars = list(col1 = "1.2.", col2 = "1.2.1.")))

purrr::map(checks, ~ exec(.x$fn, x = .x$df, !!!.x$pars))
#> $df
#> [1] 8
#> 
#> $df
#> [1] 5 6
stefan
    Good suggestions! My one comment is that if you are using the same `df` throughout, presumably used above in the code, then you don't have to add it repeatedly to the list. Just do `purrr::map(checks, ~ exec(.x$fn, x = df, !!!.x$pars))` and it should pass it along. Depends on the use case, like usual. –  Aug 11 '21 at 14:53
-4

Use select({{col1}}, {{col2}}); this will most probably help you.
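For context, `{{ }}` is meant for function arguments supplied as unquoted column names, not for strings such as those stored in `pars`. A minimal sketch of that usage follows, where `check_bare` and `toy` are hypothetical names introduced only for illustration:

```r
library(dplyr)

# Hypothetical stand-in data with non-syntactic column names
toy <- data.frame(
  `1.1.` = c("Max", "Jason"),
  `1.2.` = c(1, NA),
  check.names = FALSE
)

# Hypothetical variant of check_1 that takes bare (unquoted) column names
check_bare <- function(x, col1, col2) {
  x %>%
    dplyr::mutate(row.index = dplyr::row_number()) %>%
    dplyr::filter({{ col1 }} == "Jason" & is.na({{ col2 }})) %>%
    dplyr::pull(row.index)
}

# Non-syntactic names need backticks when passed unquoted
hits <- check_bare(toy, `1.1.`, `1.2.`)
hits  # 2
```

When the column names arrive as strings, as they do in the `checks` list, `.data[[col1]]` is the form that applies inside `filter`.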

MaxMiak