I have a data frame with many variables:

> head(fit_dat[,c(1:3)])
         var_a                   var_b                        var_c
1         1.14                  2.3815                       1.0606
2         0.83                  1.5818                       1.2450
3         0.92                  1.8848                       1.0606
4         0.96                  1.4596                       1.0606
5         1.16                  0.9677                       1.0248
6         0.81                  2.4058                       1.1189

I also have a named vector whose elements correspond, by name, to the variables in my data frame:

> g[c(1:3)]
                                var_a 
                            1.4020096 
                                var_b
                            0.9118361 
                                var_c
                            1.2868801 

I want to mutate every column of my data frame without naming each of its many columns, and I want to do this dynamically, so that each variable's name is used inside the ~ function to look up its element of g. I attempted this with the code below, but it doesn't work. How could I accomplish this without using joins, loops, or naming every variable?

More generally, I've been wondering: when I supply such a function to mutate_all, what exactly is passed to it on each call?

library(tidyverse)
fit_dat %>% mutate_all(list(z = ~ . * g[colnames(.)])) # this `colnames` call is the problem!

Thank you!

Voy
    Please provide a [reproducible minimal example](https://stackoverflow.com/q/5963269/8107362). Especially, please provide your example data, e.g. with `dput()` – mnist Nov 05 '19 at 22:18
    *"I want to mutate"* could mean anything. When you follow @wusel's advice (provide sample data and minimal functional code), please don't forget to include your expected output *given your sample data*. – r2evans Nov 05 '19 at 23:14

2 Answers

If I have understood you correctly, you want to multiply each column by the element of g that has the same name. You can do this directly in base R, without any libraries: subset g by the names of fit_dat so that it is in column order, then multiply it with fit_dat.

t(t(fit_dat) * g[names(fit_dat)])

#  var_a  var_b var_c
#1 1.598 2.1715 1.365
#2 1.164 1.4423 1.602
#3 1.290 1.7186 1.365
#4 1.346 1.3309 1.365
#5 1.626 0.8824 1.319
#6 1.136 2.1937 1.440
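
Note that the transposed expression above returns a matrix rather than a data frame. If a data frame is needed, a minimal sketch using sweep() or Map() might look like the following. It assumes the same fit_dat and g as in the data section, rebuilt here with data.frame() so the block runs on its own; the names scaled and scaled_map are only illustrative.

```r
# Sample data, copied from the "data" section of this answer
fit_dat <- data.frame(
  var_a = c(1.14, 0.83, 0.92, 0.96, 1.16, 0.81),
  var_b = c(2.3815, 1.5818, 1.8848, 1.4596, 0.9677, 2.4058),
  var_c = c(1.0606, 1.245, 1.0606, 1.0606, 1.0248, 1.1189)
)
g <- c(var_a = 1.4020096, var_b = 0.9118361, var_c = 1.2868801)

# Put g in column order, then multiply along the column margin (MARGIN = 2)
scaled <- sweep(fit_dat, 2, g[names(fit_dat)], `*`)

# Alternative: Map() pairs each column with its multiplier
scaled_map <- as.data.frame(Map(`*`, fit_dat, g[names(fit_dat)]))
```

Both forms rely on g[names(fit_dat)] to line the multipliers up with the columns, so the order of g itself does not matter.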

data

fit_dat <- structure(list(var_a = c(1.14, 0.83, 0.92, 0.96, 1.16, 0.81), 
var_b = c(2.3815, 1.5818, 1.8848, 1.4596, 0.9677, 2.4058), 
var_c = c(1.0606, 1.245, 1.0606, 1.0606, 1.0248, 1.1189)), 
class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6"))
g <- c(var_a = 1.4020096 , var_b = 0.9118361, var_c = 1.2868801)
Ronak Shah

If you want to use the tidyverse, your data should be in tidy (long) format, which is what its functions expect. Here's one solution:

fit_dat %>% 
  mutate(id = row_number()) %>%   # row identifier, needed to pivot back
  pivot_longer(1:3, names_to = "var", values_to = "fit") %>% 
  mutate(fit = fit * g[var]) %>%  # look up each multiplier by variable name
  pivot_wider(names_from = "var", values_from = "fit") %>% 
  select(-id)

You'll need the id column in order to pivot back to the wide form you started with (otherwise pivot_wider has no way to tell which values belong to the same row). The tidyverse may not be the best solution in this case, since Ronak's method with transpose requires less code, but if you need more complex mutations, this at least shows how to get the data into long format and back.
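
On the narrower question of what the lambda receives: mutate_all hands it each column's values as a bare vector, so colnames(.) is NULL and the lookup in g comes back empty. As a sketch of a wide-format alternative, assuming dplyr 1.0.0 or later (which postdates this thread), across() makes the current column's name available through cur_column(). The name scaled is only illustrative.

```r
library(dplyr)

# Sample data, same values as in the other answer
fit_dat <- data.frame(
  var_a = c(1.14, 0.83, 0.92, 0.96, 1.16, 0.81),
  var_b = c(2.3815, 1.5818, 1.8848, 1.4596, 0.9677, 2.4058),
  var_c = c(1.0606, 1.245, 1.0606, 1.0606, 1.0248, 1.1189)
)
g <- c(var_a = 1.4020096, var_b = 0.9118361, var_c = 1.2868801)

# A single column is a plain vector, so it carries no column name
colnames(fit_dat$var_a)  # NULL

# cur_column() returns the name of the column being processed,
# which is used here to look up the matching multiplier in g
scaled <- fit_dat %>%
  mutate(across(everything(), ~ .x * g[[cur_column()]], .names = "{.col}_z"))
```

The .names argument mirrors the list(z = ...) naming in the question, appending _z to each new column and leaving the originals in place.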

GenesRus