
I have a data frame which looks like this:

var1<-c(150,20,30)
var2<-c(20,30,40)
x<-c(2.5,4.5,7.5)
s_var1<-c(0,0,0)
s_var2<-c(0,0,0)

data<-data.frame(var1, var2, x, s_var1, s_var2)

There could be many 'var' columns (var1, var2, ..., varn), and likewise many 's_var' columns (s_var1, s_var2, ..., s_varn).

I want to write a function which does calculations on the 's_var' columns while referencing the 'var' columns and the 'x' column.

For example: if there are 2 'var' columns

n_var<-c(1,2)

for (i in n_var)
{
if (x > 2.5) { s_var[i] = var[i] } else {s_var[i] = 2*var[i]}
}

Any suggestions? I am struggling to pass the numbers in the list as suffixes to reference column names...

  • Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – Sotos Feb 08 '18 at 12:42
  • Why are you also creating `s_var1` and `s_var2`? Something like `cbind(df, setNames(data.frame(sapply(df[grepl('var' ,names(df))], function(i) ifelse(df$x > 2.5, i, 2*i))), paste0('s_', names(df)[grepl('var', names(df))])))` should work fine. – Sotos Feb 08 '18 at 13:04
  • use `get()` after `paste()` to construct variable names – abhiieor Feb 08 '18 at 13:05
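As a rough sketch of the suffix lookup the question asks about: column names can be built with `paste0()` and used with `[[ ]]` indexing, which avoids `get()` altogether. This assumes the rule from the question's pseudocode (copy `var` where `x > 2.5`, otherwise double it) and reuses the question's own `data` and `n_var`; it is an illustration, not the only way to do it.

```r
var1 <- c(150, 20, 30)
var2 <- c(20, 30, 40)
x <- c(2.5, 4.5, 7.5)
s_var1 <- c(0, 0, 0)
s_var2 <- c(0, 0, 0)
data <- data.frame(var1, var2, x, s_var1, s_var2)

n_var <- c(1, 2)

for (i in n_var) {
  v <- paste0("var", i)    # source column name, e.g. "var1"
  s <- paste0("s_var", i)  # target column name, e.g. "s_var1"
  # ifelse() is vectorised, so the rule is applied row by row
  data[[s]] <- ifelse(data$x > 2.5, data[[v]], 2 * data[[v]])
}
```

A plain `if (x > 2.5)` would not work here because `x` is a vector with one value per row; `ifelse()` evaluates the condition for each row.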

1 Answer


Use grep to find the column indexes of the var and s_var columns and define a multiplier which depends on x. Finally apply the multiplier. Note that the multiplier will be recycled to multiply each of the columns in data[var_cols].

fun <- function(data) {
  var_cols <- grep("^var", names(data))
  s_var_cols <- grep("^s_var", names(data))
  multiplier <- (data$x > 2.5) + 1  # 2 where x > 2.5, otherwise 1; reads x from the data argument, not a global
  data[s_var_cols] <- multiplier * data[var_cols]  ##
  data
}

fun(data)

giving:

  var1 var2   x s_var1 s_var2
1  150   20 2.5    150     20
2   20   30 4.5     40     60
3   30   40 7.5     60     80

The line marked ## above could also be written as follows. This is not really needed here, but if the calculation were more complex then lapply or mapply might be needed.

data[s_var_cols] <- lapply(data[var_cols], `*`, multiplier)
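
Note that the multiplier above doubles the rows where x > 2.5, whereas the pseudocode in the question doubles the other rows. As a minimal sketch of the lapply form with a more explicit per-column calculation, assuming the question's rule is the one wanted (the name `fun2` is hypothetical and only used for illustration):

```r
var1 <- c(150, 20, 30)
var2 <- c(20, 30, 40)
x <- c(2.5, 4.5, 7.5)
data <- data.frame(var1, var2, x, s_var1 = 0, s_var2 = 0)

# fun2 is a hypothetical name for this variant
fun2 <- function(data) {
  var_cols <- grep("^var", names(data))      # positions of var1, var2, ...
  s_var_cols <- grep("^s_var", names(data))  # positions of s_var1, s_var2, ...
  # question's rule: keep var where x > 2.5, otherwise double it
  data[s_var_cols] <- lapply(data[var_cols],
                             function(v) ifelse(data$x > 2.5, v, 2 * v))
  data
}

res <- fun2(data)
```

This relies on the var and s_var columns appearing in the same order, so that each var column lines up with its s_var counterpart.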
G. Grothendieck