I have a data frame called data and the column names are:

c("Server", "Date", "Host_CPU", "Used_Mem_Perc", "JVM1", "JVM2", 
"JVM3", "JVM4", "JVM5", "JVM6")

I need to fit an lm model of Host_CPU against every column whose name starts with JVM. In this case it would be something like this:

lm(data=data, Host_CPU~JVM1+JVM2+JVM3+JVM4+JVM5+JVM6)

However, I don't always know in advance how many columns start with JVM. I need to read in the column names and build the lm model from them. Any ideas how I could do this in R?

user1471980
    Something like `as.formula(paste0("Host_CPU", "~", paste(nm[startsWith(nm, "JVM")], collapse = "+")))` where `nm` are the names – Rich Scriven Dec 19 '16 at 21:08
    Look at the `reformulate()` function for help building formulas. For example: `x<-c("Server", "Date", "Host_CPU", "Used_Mem_Perc", "JVM1", "JVM2", "JVM3", "JVM4", "JVM5", "JVM6"); reformulate(grep("^JVM", x, value=TRUE), "Host_CPU")` – MrFlick Dec 19 '16 at 21:09
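
The two comments above describe different routes to the same formula. The following is a minimal sketch comparing them, assuming the column names from the question; the names `jvm`, `f1` and `f2` are introduced here only for illustration.

```r
# Column names from the question
nm <- c("Server", "Date", "Host_CPU", "Used_Mem_Perc",
        "JVM1", "JVM2", "JVM3", "JVM4", "JVM5", "JVM6")

# Predictors: every name that starts with "JVM"
jvm <- nm[startsWith(nm, "JVM")]

# Route 1: paste the formula together as text, then convert it
f1 <- as.formula(paste("Host_CPU ~", paste(jvm, collapse = " + ")))

# Route 2: let reformulate() assemble it from the term labels
f2 <- reformulate(jvm, response = "Host_CPU")

# Compare the text form, because formula objects also carry an environment
identical(deparse(f1), deparse(f2))
```

Either formula can be passed to lm(). reformulate() is probably the less error-prone of the two, since it avoids hand-built strings.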

1 Answer

You can use grep to pick out the matching names and reformulate to build the formula from them.

reformulate(vars[grep("^JVM", vars)], vars[3])
Host_CPU ~ JVM1 + JVM2 + JVM3 + JVM4 + JVM5 + JVM6

So the model call becomes

lm(reformulate(vars[grep("^JVM", vars)], vars[3]), data=data)

data

vars <- c("Server", "Date", "Host_CPU", "Used_Mem_Perc", "JVM1", "JVM2", "JVM3",
          "JVM4", "JVM5", "JVM6")
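
To try this end to end without the original data, here is a rough sketch that reads the names straight from the data frame instead of a separate vars vector. The data frame below is simulated and purely hypothetical (three JVM columns, made-up coefficients); only the last three statements are the technique itself.

```r
set.seed(42)  # make the toy data reproducible

# Hypothetical stand-in for `data`: 50 rows, three JVM columns
n <- 50
data <- data.frame(
  Server        = "srv01",
  Date          = as.Date("2016-12-01") + seq_len(n) - 1,
  Used_Mem_Perc = runif(n, 20, 90),
  JVM1          = runif(n, 0, 100),
  JVM2          = runif(n, 0, 100),
  JVM3          = runif(n, 0, 100)
)
# Made-up relationship so that lm() has something to recover
data$Host_CPU <- 5 + 0.3 * data$JVM1 + 0.2 * data$JVM2 +
  0.1 * data$JVM3 + rnorm(n)

# Read the predictor names off the data frame itself
jvm_cols <- grep("^JVM", names(data), value = TRUE)
if (length(jvm_cols) == 0) stop("no columns start with 'JVM'")

# Build the formula from whatever JVM columns are present and fit
fit <- lm(reformulate(jvm_cols, response = "Host_CPU"), data = data)
coef(fit)
```

Because the formula is rebuilt from names(data) each time, the same lines should work unchanged when the number of JVM columns differs.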
lmo