
I teach multiple R courses where students have to submit an analysis they did on a specific dataset. As a first (small) step towards reproducibility, I want to make sure the script they submit ran at least once on their local computer under the following constraints. It ran:

  1. in a new session
  2. from top to bottom
  3. without any errors.

Until now, I've used RMarkdown / Quarto to "enforce" this. Rendering an Rmd/Qmd file creates a new session for the code, which is executed from top to bottom, and by default errors halt the rendering process. The resulting output file (typically PDF or HTML) gives me confidence that the above requirements are met. (Of course, there are relatively easy ways to trick me, but these require malice, which I do not assume.)

What I don't like about this approach is that the students are forced to work with Rmd/Qmd files rather than plain R scripts. This generates additional overhead (e.g. when debugging) and additional sources of error (issues related to RMarkdown / Quarto rather than to the code itself).

Another approach would be to use targets, which is a great package but also generates so much overhead that it would completely overwhelm the students (who are R and programming novices).

Can anyone recommend a different, simple approach to my problem?

Ratnanil
  • I would think of it as a unit-test problem and check for successful completion (as I actually do [with my students in an autograder environment](https://arxiv.org/abs/2003.06500)). So maybe return a code (0 is common for shell scripts) or mark the result by `touch`-ing a file (can do from R too) -- or even by running the script in question from a test runner. You can help yourself by making R picky ("warnings as errors" via `options()` etc). – Dirk Eddelbuettel Jun 26 '23 at 12:28
  • Would you be open to providing them with a wrapper around `Rscript` that they would have to use to run the script? If so, you could have this wrapper create a checksum based on the script contents before or after dispatching to R. You then have the students report that checksum to you, along with their script. – Konrad Rudolph Jun 26 '23 at 12:29
  • One potential approach would be for the students to use [reprex](https://reprex.tidyverse.org/) to render their submissions. It forces a clean session, prints warnings/errors along with code and output, and it's relatively straightforward to use. There's an RStudio plugin to generate the formatted output, or you can copy the whole script/'pipeline' to the clipboard then run `reprex::reprex()` – jared_mamrot Jun 26 '23 at 12:46
  • @DirkEddelbuettel: Are you suggesting something similar to Konrad, i.e. providing a wrapper around RScript? – Ratnanil Jun 26 '23 at 13:08
  • 1
    @Ratnanil Guess what? I linked to a reference that answers that and much more :) So yes what we do is in fact run the student submission, usually submitted as a function, along with a reference function on input we can also randomize for both. The results are then compared (and aggregated) by test runners (we like `tintest`, others work too). You can wrap that into `eval()`, or a script via `Rscript` or `r`. Your setup, your call. – Dirk Eddelbuettel Jun 26 '23 at 13:24
  • @KonradRudolph: I am open to using wrappers! However, am I correct in assuming that this approach only works if the students don't know how to generate a checksum themselves? – Ratnanil Jun 26 '23 at 13:27
  • @jared_mamrot: Love this idea! Want to make it an answer? – Ratnanil Jun 27 '23 at 14:44
  • Sure thing @Ratnanil - I think it's a good option, but I wasn't sure if you'd already considered it and dismissed it for some reason. I'll add some potential cons along with the pros – jared_mamrot Jun 28 '23 at 00:15
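A minimal sketch of the exit-code / `touch` / warnings-as-errors ideas from the comments might look like this (the function name, file names, and marker-file convention are my own assumptions, not something prescribed in the thread):

```r
# check_submission.R -- hypothetical checker; all names here are
# assumptions, not something prescribed in the thread.
check_submission <- function(script) {
  # Fresh session (--vanilla: no .RData, no .Rprofile), run top to
  # bottom, with warnings promoted to errors via options(warn = 2).
  expr <- sprintf("options(warn = 2); source('%s', echo = TRUE)", script)
  status <- system2("Rscript", c("--vanilla", "-e", shQuote(expr)))
  if (status == 0) {
    # On success, record an md5 checksum of the script contents in a
    # marker file, which the student reports along with the script.
    writeLines(unname(tools::md5sum(script)), paste0(script, ".ok"))
  }
  invisible(status == 0)
}

# check_submission("analysis.R")
```

An exit status of 0 from `Rscript` means the script ran top to bottom without errors; the checksum ties that run to the exact file contents, combining the exit-code and checksum suggestions above.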

2 Answers


One potential solution is for the students to use reprex to render their submissions.

Pros:

  • runs their code in a clean session
  • prints warnings/errors along with the code
  • you can copy/paste their code, along with their output, and run it on your own computer 'as is' to confirm their submission is valid
  • relatively straightforward to use:
    • installing the package installs an RStudio plugin
    • you can copy your code / the whole script to the clipboard, then run `reprex::reprex(wd = ".", venue = "html")` to get a rendered HTML file and a markdown file in your current working directory for submission
    • you can reprex an R script directly, e.g. `reprex(input = "my_reprex.R")`
  • if/when students come to Stack Overflow for help, they'll already know how to reprex their minimal reproducible example, making their code easier for us to run and ensuring they include the libraries they're using, eliminating typos, etc.
  • in my opinion, the docs are very good; they are easy to understand and implement if you have a basic understanding of R
  • you can add `sessionInfo()` to the reprex (`session_info = TRUE`) to help you support your students with troubleshooting/issues/bugs/etc.
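Taken together, the calls mentioned above boil down to a couple of lines (the file name is a placeholder):

```r
library(reprex)

# Render an existing R script: produces a markdown file (and an HTML
# file with venue = "html") in the current working directory (wd = ".").
reprex(input = "my_reprex.R", wd = ".", venue = "html")

# Optionally append sessionInfo() to the output for troubleshooting:
reprex(input = "my_reprex.R", session_info = TRUE)
```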

Cons:

  • html/markdown submission files can be edited (may invite attempts to cheat, although running their submission on your computer and getting a different checksum would catch this)
  • if students are using RStudio Cloud, or something similar, the steps are slightly different (explained in the docs; definitely not an insurmountable problem)
  • the reprex docs are very good, but 'first timers' / beginners may struggle with some of the concepts (you may need to explain some of the concepts in more detail)

Overall, I think it's a good option if the students read the docs and you provide clear instructions/expectations for the final submission file/s.

jared_mamrot

Why not use the same method but just put all of their script in one code chunk? I typically give my students a Quarto template so that the header and all that is already configured as well.
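For illustration, such a template might look something like this (the contents are placeholders, not an official template):

````markdown
---
title: "Assignment 1"
format: html
---

```{r}
# Students paste their entire script into this single chunk.
# Rendering runs it in a fresh session, top to bottom, and by
# default stops at the first error.
```
````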

  • That was my initial idea as well, but I'm not a fan of it, since the two main problems still hold: (1) Rmd/Qmd generate additional overhead (e.g. when debugging); (2) Rmd/Qmd add additional sources of error (issues related to RMarkdown / Quarto rather than to the code). – Ratnanil Jun 29 '23 at 13:35