I am trying to "center" multiple columns in a dataframe using dplyr but I keep getting a "non-numeric argument to binary operator" evaluation error. I think it is because I am trying to pass a string in when my function expects a bare variable name. However, using the syms() function does not help.

center <- function(var) {
  var <- enquo(var)
  var_ctrd <- paste0(quo_name(var), "_ctrd")
  dataset <- dataset %>% 
    group_by(Gender) %>% 
    mutate(!! var_ctrd := !! var - mean(!! var, na.rm = TRUE))
}

# Pull out character vector of modifier names
mod_names <- dataset %>% 
  select(NeckLengthCm:FlexExtDiff_Peak_abs) %>% 
  colnames()

# Iterate over modifiers
walk(syms(mod_names), center)

Does anyone know how to solve this or if there is a better solution?

alistaire
Hank Lin
  • Incidentally, why not use the existing `scale` function to center the values? – Konrad Rudolph Oct 30 '18 at 23:26
  • It doesn’t require a matrix but it indeed *returns* a matrix. That’s mildly idiotic. You could do `as.vector(scale(!! var, scale = FALSE))` but it’s hard to justify that over your code. – Konrad Rudolph Oct 30 '18 at 23:31
  • Actually I think I will do that since it is a little shorter lol. Do you know how to iterate over a character vector and apply the function though? – Hank Lin Oct 30 '18 at 23:40
  • @hlinee: Could you make your problem reproducible by sharing a sample of your data so others can help (please do not use `str()`, `head()` or screenshot)? You can use the [`reprex`](https://reprex.tidyverse.org/articles/articles/magic-reprex.html) and [`datapasta`](https://cran.r-project.org/web/packages/datapasta/vignettes/how-to-datapasta.html) packages to assist you with that. See also [Help me Help you](https://speakerdeck.com/jennybc/reprex-help-me-help-you?slide=5) & [How to make a great R reproducible example?](https://stackoverflow.com/q/5963269) – Tung Oct 31 '18 at 05:11
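
On the iteration question raised in the comments, here is a minimal sketch rather than a definitive fix. The error in the question most likely arises because `walk()` calls `center(.x[[i]])`, so `enquo(var)` captures that expression rather than the column name, and `walk()` also discards whatever `center()` returns. One way around both problems is to pass the data and the column name (as a string) explicitly and fold over the names with `reduce()`. The helper name `center_col`, the use of `mtcars`, and `cyl` as the grouping column are assumptions for illustration; they stand in for `dataset` and `Gender`.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: takes a data frame and a column name given as a string,
# and returns the data with one extra "<name>_ctrd" column, centered within groups.
center_col <- function(data, name, group = "cyl") {
  new_name <- paste0(name, "_ctrd")
  data %>%
    group_by(across(all_of(group))) %>%
    mutate(!!new_name := .data[[name]] - mean(.data[[name]], na.rm = TRUE)) %>%
    ungroup()
}

# Character vector of columns to center (stand-in for mod_names in the question)
mod_names <- c("mpg", "disp", "hp")

# reduce() threads the growing data frame through each call,
# whereas walk() would throw the results away
centered <- reduce(mod_names, center_col, .init = mtcars)
```

Because each call returns the updated data frame, nothing depends on modifying a global `dataset` from inside the function.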

2 Answers

You can use mutate_at() to center a subset of variables by passing it a character vector of variable names:

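library(dplyr)

# Only center a subset of columns, chosen by name
vars <- colnames(mtcars)[1:4]

# scale = FALSE subtracts the column mean without dividing by the SD
mtcars %>% 
  mutate_at(vars, scale, scale = FALSE)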
TClavelle
  • That worked- thank you! However, I was hoping to not modify the variables directly but instead create new variables with the suffix "_ctrd" . Is there any way to do that by extending this code? – Hank Lin Oct 30 '18 at 23:47
  • You could assign the centered data frame a different name, modify the centered variable names, and then use `bind_cols()` to bind it to the original. Not the cleanest solution though. – TClavelle Oct 31 '18 at 00:05
  • Since `mutate_at` has been superseded, the new code should read `mtcars %>% mutate(across(vars, ~scale(., center = T, scale = T)))`. I spelled out the arguments to `scale`, but in this form it does the same as `mtcars %>% mutate(across(vars, scale))`; centering subtracts the column means from each value and `scale = T` then divides by the standard deviations – JJGabe Oct 04 '21 at 16:18
  • @TClavelle as a follow-up, the dataframe that this returns "attaches" the attributes `scaled:center` and `scaled:scale` to each variable (so each variable is actually a list/matrix) and then I am unable to write that to a csv file. Do you know how I can unattach those attributes? – JJGabe Oct 04 '21 at 17:40
  • From another post I found my answer: `mtcars %>% mutate(across(everything(), as.vector))`...do this after scaling – JJGabe Oct 04 '21 at 20:02
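
Pulling the comments above together, a sketch of how the `across()` version could create new `_ctrd` columns instead of overwriting the originals. This assumes dplyr 1.0.0 or later (for `across()` and its `.names` argument). It subtracts the mean directly rather than calling `scale()`, so the new columns are plain numeric vectors with no `scaled:center` attribute to strip afterwards.

```r
library(dplyr)

# Columns to center, given as a character vector
vars <- c("mpg", "cyl", "disp", "hp")

centered <- mtcars %>%
  mutate(across(
    all_of(vars),
    ~ .x - mean(.x, na.rm = TRUE),  # center only, no division by the SD
    .names = "{.col}_ctrd"          # keep originals, add suffixed copies
  ))
```

Adding `group_by(cyl)` (or any other grouping column) before the `mutate()` would center within groups, as in the original question.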

Here is another suggestion that stays with mutate_at():

library(dplyr)

# A named list of functions makes mutate_at() append the name as a suffix
mtcars %>% 
  mutate_at(.vars = colnames(mtcars)[1:4], 
            .funs = list("scaled" = scale))

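This gives you what you wanted: the scaled variables in new columns with a "_scaled" suffix. Note that scale() with its default arguments both centers and divides by the standard deviation. The first six rows: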

   mpg cyl disp  hp drat    wt  qsec vs am gear carb mpg_scaled cyl_scaled disp_scaled  hp_scaled
1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4  0.1508848 -0.1049878 -0.57061982 -0.5350928
2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4  0.1508848 -0.1049878 -0.57061982 -0.5350928
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  0.4495434 -1.2248578 -0.99018209 -0.7830405
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1  0.2172534 -0.1049878  0.22009369 -0.5350928
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 -0.2307345  1.0148821  1.04308123  0.4129422
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 -0.3302874 -0.1049878 -0.04616698 -0.6080186
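
If only centering is wanted, as in the question, the same named-list pattern can be adapted. The following is a sketch, not part of the answer above: the suffix name `ctrd` is chosen to match the question, and `as.vector()` is wrapped around `scale()` so that the new columns are ordinary numeric vectors rather than one-column matrices.

```r
library(dplyr)

centered <- mtcars %>%
  mutate_at(
    .vars = colnames(mtcars)[1:4],
    # scale = FALSE centers without dividing by the standard deviation;
    # as.vector() drops the matrix shape and attributes that scale() returns
    .funs = list(ctrd = ~ as.vector(scale(.x, scale = FALSE)))
  )
```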
Marco