The Writing R Extensions manual states:
The data subdirectory is for data files, either to be made available via lazy-loading or for loading using data(). (The choice is made by the ‘LazyData’ field in the DESCRIPTION file: the default is not to do so.) It should not be used for other data files needed by the package, and the convention has grown up to use directory inst/extdata for such files.
But it is still not clear what data is "required" by a package. I would like to use data for the following (not always mutually exclusive) reasons:
- documentation
- function examples
- function tests
- vignettes
- to provide access to an original data set
- to make data available to functions within the package (e.g. a lookup table / dictionary)
But it is not clear which of these should go in the data folder, and which should go in inst/extdata. And are there any conditions under which "data" should go elsewhere?
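For context, this is how I understand the two locations to be accessed once a package is installed; it is only a minimal sketch, not a recommendation. The first call uses the `datasets` package that ships with R, while `mypkg` and `lookup.csv` are hypothetical names used purely for illustration.

```r
# Objects stored in a package's data/ directory are loaded by name.
# 'datasets' ships with R, so this part runs as-is.
data("mtcars", package = "datasets")
head(mtcars)

# Files stored in inst/extdata are installed under <pkg>/extdata and are
# located by path. "mypkg" and "lookup.csv" are hypothetical names;
# system.file() returns "" when the package or file cannot be found.
path <- system.file("extdata", "lookup.csv", package = "mypkg")
if (nzchar(path)) {
  lookup <- read.csv(path)
}
```

So data/ exposes R objects by name, whereas inst/extdata exposes raw files by path, which is part of why I am unsure which one suits each use case above.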
Related questions: Previous questions (e.g. "inst and extdata folders in R Packaging" and "Using inst/extdata with vignette during package checking R 2.14.0") give some instructions on use, but don't tell me how to decide which directory to use. Another question, "R - where should I place RDA file - /R, /data, /inst/extdata?", gets the closest, but seems to focus specifically on RDA and RData files.