
I am experimenting with different regression models. My end goal is a nice, easy-to-read data frame with 3 columns:

model_results <- data.frame(name = character(),
                            rmse = numeric(),
                            r2 = numeric())

Then, after running each model, I want to add the corresponding output to the data frame and, at the end, review it and decide which model to use.

I tried this:

mod.spend_transactions.results <- list("mod.spend_transactions",
rsme(residuals(mod.spend_transactions)),
summary(mod.spend_transactions)$r.squared)

I tried using a list because I know vectors can only store one datatype (right?).

Output:

rbind(model_results, mod.spend_transactions.results)
  X.mod.spend_transactions. X12.6029444519635 X0.912505643567096
1    mod.spend_transactions          12.60294          0.9125056

Close, but not what I wanted: the data frame's column names have been changed, which I did not expect.

So I tried vectors, which works but seems "clunky"; I'm sure I could do this with less code:

vect_modname <- vector()
vect_rsme <- vector()
vect_r2 <- vector()

Then after running a model

  vect_modname <- c(vect_modname, "mod.spend_transactions")
  vect_rsme <- c(vect_rsme, rsme(residuals(mod.spend_transactions)))
  vect_r2 <- c(vect_r2, summary(mod.spend_transactions)$r.squared)

Then at the end of running all the models I'm testing out

data.frame(vect_modname, vect_rsme, vect_r2)

Again, the vector method does work. But is there a "better", more elegant way of doing this?

Doug Fir
  • It's helpful when asking a question to provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input so we can actually run the code. A data.frame with no rows is basically ignored in `rbind()`. It's best to generate a list of named lists, then `do.call("rbind", ...)` or `dplyr::bind_rows(...)` to bind them all together at the end, not one-by-one. It's never a good idea to build a data.frame one row at a time. – MrFlick Apr 09 '17 at 03:56
  • Basically see this question: http://stackoverflow.com/questions/29402528/append-data-frames-together-in-a-for-loop and this one http://stackoverflow.com/questions/2851327/convert-a-list-of-data-frames-into-one-data-frame or see here: http://stackoverflow.com/questions/3642535/creating-an-r-dataframe-row-by-row. Can't decide which is the best duplicate – MrFlick Apr 09 '17 at 04:00
  • Thanks! You basically answered my question in your first comment. I can create a df using bind_rows with lists at the end. Any drawbacks to building vectors line by line then data.frame(myvector1, myvector2) at the end? – Doug Fir Apr 09 '17 at 04:11
  • It's also very inefficient. Not a big deal if you are only adding a few rows, but generally not recommended. If you are building vectors, it's always best to preallocate, as demonstrated in the linked answers. – MrFlick Apr 09 '17 at 04:14
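
The "collect everything, then bind once" approach suggested in the comments could look roughly like the sketch below. It is only an illustration under stated assumptions: the two `lm()` fits on the built-in `mtcars` data stand in for the question's models, which are not shown, and `rmse()` is a hypothetical helper, not the question's own `rsme()` function. Each model contributes a one-row data frame rather than a bare list, because the column names are then fixed up front and `do.call(rbind, ...)` returns a regular data frame.

```r
# Stand-in models (assumption): the question's models and data are not shown,
# so two lm() fits on the built-in mtcars data are used for illustration.
models <- list(
  mod.wt    = lm(mpg ~ wt, data = mtcars),
  mod.wt_hp = lm(mpg ~ wt + hp, data = mtcars)
)

# Hypothetical helper (assumption): root mean squared error of a residual vector.
rmse <- function(res) sqrt(mean(res^2))

# Build one one-row data frame per model, with the final column names already set.
rows <- lapply(names(models), function(nm) {
  fit <- models[[nm]]
  data.frame(name = nm,
             rmse = rmse(residuals(fit)),
             r2   = summary(fit)$r.squared,
             stringsAsFactors = FALSE)
})

# Bind all rows together once, at the end, instead of growing a data frame row by row.
model_results <- do.call(rbind, rows)
print(model_results)
```

If dplyr is available, `dplyr::bind_rows(rows)` should work in place of the `do.call()` line. Keeping the models in a named list also means the model name is written once, as the list name, rather than repeated as a string next to each result.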
