57

I am using Hadley's testthat-based approach for automated testing of my package.

With this approach, what is the most suitable place to put test data files (that is, files only used by the test scripts in tests/testthat, but not by any functions in R/)?

My current approach is to put them in tests/testdata, and then read.table from there with a relative path rather than with system.file (in order to avoid the need to install the package to run tests).

Is there a standard way to do this?

NelsonGon
Aditya
  • Yes, I think your approach with putting the data into `/tests/testdata/` and then loading via, e.g., `read.csv("../testdata/test1.csv")` is better. I have checked that the extra files in the folder `testdata` also get copied into the library directory after you have built & installed the package with flag `--install-tests`. The latter is important because the tests should be distributed with the package IMHO. – cryo111 Sep 01 '15 at 17:19
  • I use `inst/testdata` and then `system.file("testdata",...,package="my_package")` – Ben Bolker Sep 01 '15 at 18:10
  • @BenBolker how do you ensure the data is loaded during `devtools::check()`? – alexwhitworth May 28 '20 at 01:58
  • I'm not sure it is. I'm 99.9% certain it works with `R CMD check`, not so sure about `devtools::check()`. – Ben Bolker May 28 '20 at 14:19
  • @BenBolker I am trying your approach but can't make it work. I placed two RDS objects in that folder (`inst/testdata/`) which I want to use in my tests. How do I access them using this `system.file("testdata", package="my_package")` approach? I want to read one of the objects back into a variable using `readRDS()` for instance. – Faustin Gashakamba Sep 19 '21 at 16:13
  • can you be more specific about what "can't make it work" means? If it's too complicated to explain in comments, you can go ahead and post a new question that links back to this one. (If you have an RDS file `a.rds`, then `a <- readRDS(system.file("testdata", "a.rds", package="my_package"))` *should* work via `R CMD check` or `devtools::check()`) – Ben Bolker Sep 19 '21 at 22:13

4 Answers

45

Lifting from Ben Bolker's comments:

I use inst/testdata and then system.file("testdata",...,package="my_package")

The advantages of this method:

  • You can keep your file structure neat, especially if you have many data files and/or tests.
  • Installing the files in inst is long-standing canonical R practice, so system.file("testdata", "some_file", package = "my_package") seems more likely to keep working than ../testdata/some_file. I've had bad experiences using relative file paths when doing R CMD check.
  • Unlike Sathish's answer, it doesn't depend on your data being "stored" as R code.
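
A minimal sketch of this approach, assuming the data lives in inst/testdata and the package is called my_package (both placeholder names from the comment above). The wrapper `testdata_path()` is a hypothetical helper, not part of any package; it only adds `mustWork = TRUE`, so that a missing file raises an error instead of silently returning an empty string:

```r
# Hypothetical helper; "my_package" is the placeholder package name used above.
testdata_path <- function(..., package = "my_package") {
  # mustWork = TRUE raises an error for a missing file instead of returning ""
  system.file("testdata", ..., package = package, mustWork = TRUE)
}

# In a test, with inst/testdata/a.rds present in the package:
# a <- readRDS(testdata_path("a.rds"))
```

This needs the package to be installed, or loaded with devtools::load_all(), before the path can be resolved.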
21

I had the same problem. I filed an issue at github.com/r-lib/devtools and one of the developers (Jenny Bryan) helped with this!

The solution is to put all data for testing into "tests/testthat" or some subdirectory of it. In your tests, you can build the paths with testthat::test_path(). With this approach, the tests work both interactively AND in R CMD check or devtools::check()!

Example:

Package structure

└──pkg_name/
    ├── DESCRIPTION
    ...
    └── tests/
        ├── testthat.R
        └── testthat/
            ├── test-some_function.R
            └── testdata
                ├── file_1.csv
                └── file_2.tif

test-some_function.R

test_that("testname", {
  expect_equal(
    some_function(
      test_path("testdata", "file_2.tif")
      ),
    ...
  )
})
MxNl
19
  • tests are kept inside a file which is prefixed with 'test_'
  • data are kept inside files prefixed with 'helper_'

Package Directory and File Structure:

└──pkg_name/
    ├── DESCRIPTION
    ├── NAMESPACE
    ├── .Rbuildignore
    ├── data/
    ├── man/
    ├── R/
    ├── vignettes/
    └── tests/
        ├── testthat.R
        └── testthat/
             ├── helper_myfunc1.R
             ├── helper_myfunc2.R
             └── test_pkg_name.R

testthat.R

library(testthat)
library(pkg_name)
test_check("pkg_name")

helper_myfunc1.R contains data for testing myfunc1 function

a1 <- 2
a2 <- 2
b1 <- 2*3
b2 <- 6

helper_myfunc2.R contains data for testing myfunc2 function

c1 <- 50/2
c2 <- 25
d1 <- c(2,3)
d2 <- c(2,3)

test_pkg_name.R contains tests for functions and other objects in the package

context('pkg_name_functions')

test_that('myfunc1',
          {
            expect_identical(a1, a2)
            expect_identical(b1, b2)
          })

test_that('myfunc2',
          {
            expect_identical(c1, c2)
            expect_identical(d1, d2)
          })

Conduct unit testing

library("devtools")

devtools::load_all()
# Loading pkg_name

devtools::test()
# Loading pkg_name
# Testing pkg_name
# pkg_name_functions: ....

# DONE ================================================================
Sathish
  • How do you tell your test script test_pkg_name.R where the data is? As your answer is now, you should get an error that e.g. a1 could not be found (assuming you directly run `devtools::test()` without `devtools::load_all()`). I often only run the test as suggested by Hadley in his simple [workflow illustration](http://r-pkgs.had.co.nz/tests.html) – Triamus Jan 18 '17 at 10:30
  • This is a fine answer, though the helper data is not filtered by test description, i.e. all variables defined in `helper_*.R` files will be available for all tests: `test_that('myfunc2', ...` will be able to access `a1, a2, b1, b2` etc. – Linards Kalvans Nov 30 '18 at 09:32
  • how do you ensure the data is loaded during `devtools::check()`? – alexwhitworth May 28 '20 at 01:59
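
One possible way to address the scoping concern raised in the comments, sketched under the assumption that the helper file defines a function instead of top-level variables. The name `make_myfunc1_data()` is hypothetical; the values are the ones from helper_myfunc1.R above:

```r
library(testthat)

# Hypothetical fixture builder, e.g. placed in helper_myfunc1.R.
# Only the function is shared between tests; the values are created on demand.
make_myfunc1_data <- function() {
  list(a1 = 2, a2 = 2, b1 = 2 * 3, b2 = 6)
}

test_that("myfunc1", {
  d <- make_myfunc1_data()  # data exists only inside this test
  expect_identical(d$a1, d$a2)
  expect_identical(d$b1, d$b2)
})
```

With this layout the function itself is still visible to every test, but each test gets its own copy of the data and has to ask for it explicitly.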
12

The Data chapter of the same R-Pkgs book says "it’s ok to put small files directly in your test directory". That's what I've done in the past. And it sounds like that's what you're already doing, plus the testdata directory.

Dylan