I am working on a project in R and have created a markdown document (knitted to PDF) for it. The document contains exactly the same code as the .R file; both the .R and .Rmd files are in the same directory.
The purpose of the project is to build a predictive rating system. The code examines a few algorithms, picks the best one, checks it on a validation data set, and gets an expected result very close to (and actually lower than) what the last version of the algorithm produced.
However, when the same code is run inside the .Rmd file, the results are significantly off.
I have tried some of the things recommended here:
I specifically set the working directory with the here package and used save.image() to save the current environment; I then loaded that file in the markdown document to restore the environment. That didn't work.
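For reference, the save/restore workflow I tried looks roughly like this (the .RData file name is just a placeholder):

```r
## In the .R script: pin the working directory and snapshot the environment
library(here)
setwd(here::here())   # project root, the folder holding both files
# ... run the full analysis ...
save.image(file = here::here("analysis_env.RData"))  # placeholder file name

## In a setup chunk of the .Rmd: restore that saved environment
load(here::here("analysis_env.RData"))
```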
I have used rmarkdown::render() to knit the file, after setting knitr::opts_chunk$set(cache = TRUE).
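Concretely, I knit from the console rather than the Knit button, along these lines (the .Rmd file name is a placeholder):

```r
# Enable chunk caching, then render in the current session so the
# script's environment and options are (in theory) shared
knitr::opts_chunk$set(cache = TRUE)
rmarkdown::render(here::here("report.Rmd"))  # placeholder file name
```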
When I run str(options()) in both, I get very different outputs. This is the only lead I have at this point.
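A minimal way to pin down exactly which options differ between the two contexts (file names are illustrative) is to snapshot options() in each and diff the snapshots:

```r
# Run this line in the .R session:
saveRDS(options(), "opts_script.rds")
# ...and this line in a chunk of the .Rmd:
saveRDS(options(), "opts_rmd.rds")

# Afterwards, in either session, compare the two snapshots:
a <- readRDS("opts_script.rds")
b <- readRDS("opts_rmd.rds")
setdiff(names(a), names(b))   # options set only in the script
setdiff(names(b), names(a))   # options set only in the .Rmd
common <- intersect(names(a), names(b))
common[!mapply(identical, a[common], b[common])]  # shared options whose values differ
```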
How do I get options() to be the same in both files? Is there a setting for this in RStudio? I had thought that because both files are in the same location this wouldn't be a problem, but here I am...
EDIT: I'm enclosing the links to my github code:
These scripts process millions of rows of data, so they will take time to run. In the end, the output from the .R file looks like the following (or something similar); notice that the final RMSE value, on the validation set, is lower than all the rest.
|method | RMSE|
|:------------------------------------------------------------------------------|---------:|
|Just the average | 1.0429393|
|Just Movie Effect Model b_i average | 0.9390297|
|User Effect + Movie Effect Model                                               | 0.8556187|
|Regularized + Movie Effect | 0.9390291|
|Regularized Movie + User Effect Model | 0.8555519|
|Regularized Movie + User Effect Model + Year Effect Model | 0.8552506|
|Regularized Movie + User Effect Model + Year Effect Model + Genre Effect Model | 0.8551794|
|Final Model Tested on Validation Set | 0.8550428|
The output from the .Rmd file inevitably looks like this; notice that the last RMSE is higher than the few previous ones:
|method                                               |      RMSE|
|:----------------------------------------------------|---------:|
|Just the average                                     | 1.0406546|
|Movie Effect Model                                   | 0.9266068|
|User Effect + Movie Effect Model                     | 0.8440959|
|Regularized + Movie Effect                           | 0.9266057|
|Regularized Movie + User Effect Model                | 0.8440903|
|Regularized Movie + User + Year Effect Model         | 0.8439340|
|Regularized Movie + User + Year + Genre Effect Model | 0.8438856|
|Final Model Tested on Validation Set                 | 0.8451003|
Any input would be greatly appreciated.
TIA