I have a data frame with the columns Time_point_secs, Treatment and Pellet.

I want to test for normality before running statistics and then produce a line graph. I am writing a function so that I can repeat the same code for other columns in the data frame (e.g. Pellet_count).

My function is:

line_graph <- function(var) {
  Normality <- df %>%
    group_by(Treatment, Time_point_secs) %>%
    filter(n_distinct(.data[[var]]) > 1) %>%
    shapiro_test(.data[[var]]) # rstatix package

  return(Normality)
}

line_graph("Pellet")

But I get an error saying:

Error in `mutate()`:
ℹ In argument: `data = map(.data$data, .f, ...)`.
Caused by error in `map()`:
ℹ In index: 1.
Caused by error in `select()`:
! Can't subset columns that don't exist.
✖ Column `.data[["Pellet"]]` doesn't exist.

I've also tried [[var]] and {{var}}, but neither works.

JLit98

1 Answer

Embracing ({{var}}) should work without any issues, provided the function is called with an unquoted data-variable (line_graph(Pellet)). You might have called it with a character vector instead, as in your example (line_graph("Pellet")), which is not what embracing expects.

shapiro_test(.data[[var]]) fails because, for its ... argument, shapiro_test() expects "/../ One or more unquoted expressions (or variable names) separated by commas /../". The expression .data[[var]] is apparently treated as a literal column name, as if the call were shapiro_test(vars = '.data[[var]]'), and no column with that name exists.

So either embrace the argument and pass it without quotes, or keep passing a character vector and hand it to shapiro_test() through its vars parameter:

library(rstatix)
library(dplyr)

# use env-variable, call with character vector: f_envvar("var")
f_envvar<-function(var){
  mtcars %>%
    group_by(am, gear) %>%
    filter(n_distinct(.data[[var]]) > 1) %>%
    shapiro_test(vars = var) #rstatix package
}

# use data-variable, call with unquoted promise: f_embrace(var)
f_embrace<-function(var){
  mtcars %>%
    group_by(am, gear) %>%
    filter(n_distinct({{var}}) > 1) %>%
    shapiro_test({{var}}) #rstatix package
}

norm_envvar <- f_envvar("vs")
norm_embrce <- f_embrace(vs)
norm_envvar
#> # A tibble: 3 × 5
#>      am  gear variable statistic          p
#>   <dbl> <dbl> <chr>        <dbl>      <dbl>
#> 1     0     3 vs           0.499 0.00000348
#> 2     1     4 vs           0.566 0.0000632 
#> 3     1     5 vs           0.552 0.000131

# check if identical:
identical(norm_envvar, norm_embrce)
#> [1] TRUE

tibble(mtcars)
#> # A tibble: 32 × 11
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
# ...
#> # ℹ 22 more rows
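
Applied to the data frame from the question, the character-vector variant might look like the sketch below. The example data are made up (random values, six observations per group) and normality_by_group() is a hypothetical name; only the column names Treatment, Time_point_secs and Pellet come from the question. Passing the data frame in as an argument instead of reading df from the global environment is a choice made for this sketch, not something the original function did.

```r
library(dplyr)
library(rstatix)

# Made-up example data: 2 treatments x 2 time points, 6 observations per group
set.seed(1)
df <- expand.grid(
  Treatment = c("Control", "Drug"),
  Time_point_secs = c(30, 60),
  replicate = 1:6,
  stringsAsFactors = FALSE
)
df$Pellet <- rnorm(nrow(df), mean = 10, sd = 2)

# Hypothetical helper: `var` is a column name given as a string
normality_by_group <- function(data, var) {
  data %>%
    group_by(Treatment, Time_point_secs) %>%
    # drop groups in which every value is identical
    filter(n_distinct(.data[[var]]) > 1) %>%
    # pass the string through `vars`, not through `...`
    shapiro_test(vars = var)
}

res <- normality_by_group(df, "Pellet")
res
```

Calling normality_by_group(df, "Pellet_count") would then reuse the same code for another column, provided that column exists in the data frame.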

Backtrace for original approach with shapiro_test(.data[[var]]):

line_graph<-function(var){
  
  Normality<- mtcars %>%
    group_by(am, gear) %>%
    filter(n_distinct(.data[[var]]) > 1) %>%
    shapiro_test(.data[[var]]) #rstatix package
  
  return(Normality)
}

line_graph("vs")
#> Error in `mutate()`:
#> ℹ In argument: `data = map(.data$data, .f, ...)`.
#> Caused by error in `map()`:
#> ℹ In index: 1.
#> Caused by error in `select()`:
#> ! Can't subset columns that don't exist.
#> ✖ Column `.data[["vs"]]` doesn't exist.
#> Backtrace:
#>      ▆
#>   1. ├─global line_graph("vs")
#>   2. │ └─... %>% shapiro_test(.data[[var]])
#>   3. ├─rstatix::shapiro_test(., .data[[var]])
#>   4. │ └─data %>% doo(shapiro_test, ..., vars = vars)
#>   5. ├─rstatix::doo(., shapiro_test, ..., vars = vars)
#>   6. │ └─... %>% mutate(data = map(.data$data, .f, ...))
#>   7. ├─dplyr::mutate(., data = map(.data$data, .f, ...))
#>   8. ├─dplyr:::mutate.data.frame(., data = map(.data$data, .f, ...))
#>   9. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
#>  10. │   ├─base::withCallingHandlers(...)
#>  11. │   └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
#>  12. │     └─mask$eval_all_mutate(quo)
#>  13. │       └─dplyr (local) eval()
#>  14. ├─purrr::map(.data$data, .f, ...)
#>  15. │ └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
#>  16. │   ├─purrr:::with_indexed_errors(...)
#>  17. │   │ └─base::withCallingHandlers(...)
#>  18. │   ├─purrr:::call_with_cleanup(...)
#>  19. │   └─rstatix (local) .f(.x[[i]], ...)
#>  20. │     └─data %>% select(!!!syms(vars))
#>  21. ├─dplyr::select(., !!!syms(vars))
#>  22. ├─dplyr:::select.data.frame(., !!!syms(vars))
#>  23. │ └─tidyselect::eval_select(expr(c(...)), data = .data, error_call = error_call)
#>  24. │   └─tidyselect:::eval_select_impl(...)
#>  25. │     ├─tidyselect:::with_subscript_errors(...)
#>  26. │     │ └─rlang::try_fetch(...)
#>  27. │     │   └─base::withCallingHandlers(...)
#>  28. │     └─tidyselect:::vars_select_eval(...)
#>  29. │       └─tidyselect:::walk_data_tree(expr, data_mask, context_mask)
#>  30. │         └─tidyselect:::eval_c(expr, data_mask, context_mask)
#>  31. │           └─tidyselect:::reduce_sels(node, data_mask, context_mask, init = init)
#>  32. │             └─tidyselect:::walk_data_tree(new, data_mask, context_mask)
#>  33. │               └─tidyselect:::as_indices_sel_impl(...)
#>  34. │                 └─tidyselect:::as_indices_impl(...)
#>  35. │                   └─tidyselect:::chr_as_locations(x, vars, call = call, arg = arg)
#>  36. │                     └─vctrs::vec_as_location(...)
#>  37. └─vctrs (local) `<fn>`()
#>  38.   └─vctrs:::stop_subscript_oob(...)
#>  39.     └─vctrs:::stop_subscript(...)
#>  40.       └─rlang::abort(...)

Created on 2023-06-27 with reprex v2.0.2
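
Frame 20 of the backtrace shows rstatix selecting columns with select(!!!syms(vars)). The sketch below imitates that step outside rstatix, assuming that vars ends up holding the literal string shown in the error message. That assumption is inferred from the error output and has not been checked against the rstatix source.

```r
library(dplyr)
library(rlang)

# A plain column name becomes a symbol that select() can resolve
ok <- mtcars %>% select(!!!syms("vs"))
names(ok)

# Assumed content of `vars` in the failing call (taken from the error message)
bad_vars <- '.data[["vs"]]'

# syms() turns the whole string into ONE symbol, i.e. a column literally
# named `.data[["vs"]]`, which mtcars does not have
msg <- tryCatch(
  {
    mtcars %>% select(!!!syms(bad_vars))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg
```

If the assumption holds, this is why the string has to go in through vars as a bare column name ("vs") rather than as a .data[[...]] expression.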

margusl