My package includes raw data in .csv files, but I want them to be processed using an R script placed in the data/ directory. I've placed the raw data files in inst/extdata.

data_locs = c(file.path("..", "inst", "extdata"),
              file.path("..", "extdata"),
              file.path("extdata"),
              file.path("inst", "extdata"))

data_loc = data_locs[file.exists(data_locs)]

files = file.path(data_loc, 
                  list.files(data_loc, pattern=".*\\.csv"))

datalist = lapply(files, utils::read.csv)

data = do.call(rbind, datalist)

rm(datalist, files, data_loc, data_locs)

I list several candidate paths in data_locs because the working directory used when roxygenizing differs from the one used when building the package. Even so, list.files doesn't find any files, and I get:

==> R CMD INSTALL --no-multiarch --with-keep.source PACKAGE

* installing to library ‘/Users/noamross/Library/R/3.0/library’
* installing *source* package ‘PACKAGE’ ...
** R
** data
*** moving datasets to lazyload DB
Error in datalist[[1]] : subscript out of bounds

How do I load the data in extdata with a script in data/?

Noam Ross
  • Have you tried using `system.file()`? – hadley Jun 24 '14 at 22:03
  • @hadley I have. The problem with `system.file()` is that the package isn't installed in the library prior to roxygenizing or building it, so it doesn't find any files. – Noam Ross Jun 24 '14 at 22:13
  • Side note: `do.call(datalist, rbind)` should be `do.call(rbind, datalist)`. – jbaums Jun 24 '14 at 22:17
  • @jbaums Thanks. I just messed that up when copying over the code and changing variable names to be generic and clear. It doesn't appear to be the source of the problem. Fixed now. – Noam Ross Jun 24 '14 at 22:20
  • Are you using devtools? It adds a shim so that `system.file()` will work in development mode. – hadley Jun 24 '14 at 23:55
  • I'm not sure I understand the situation but if you want to reach your `extdata` directory you may need to append one of the library paths to `data_locs`: `file.path(.libPaths(), "your-pkg-name", "inst", "extdata")`. – javlacalle Jun 25 '14 at 07:31
  • @hadley I am using devtools (via the RStudio interface), but `system.file()` still doesn't work. I think I'm going to move to pre-compiling the datasets into a single .rda file in `data/` on package build, rather than doing so on package load. – Noam Ross Jun 25 '14 at 14:02
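
The pre-compilation route mentioned in the last comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `compile_raw_data`, the object name `combined`, and the output file `data/combined.rda` are all hypothetical, and the script is assumed to be run by hand from the package root before building, not from a script inside `data/`.

```r
# Hypothetical helper (name and defaults are assumptions, not from the question):
# read every .csv in the raw-data directory, row-bind them, and save the result
# as a single .rda file that R can lazy-load from data/.
compile_raw_data <- function(raw_dir = file.path("inst", "extdata"),
                             out_file = file.path("data", "combined.rda")) {
  # full.names = TRUE returns usable paths, so no separate file.path() call is needed
  csv_files <- list.files(raw_dir, pattern = "\\.csv$", full.names = TRUE)

  # Fail early with a readable message instead of "subscript out of bounds"
  if (length(csv_files) == 0) {
    stop("No .csv files found in ", raw_dir)
  }

  datalist <- lapply(csv_files, utils::read.csv)
  combined <- do.call(rbind, datalist)

  # Create the output directory if it does not exist yet
  dir.create(dirname(out_file), showWarnings = FALSE, recursive = TRUE)

  # The saved object keeps the name `combined` (an assumed name)
  save(combined, file = out_file)
  invisible(out_file)
}
```

Calling `compile_raw_data()` from the package root would write `data/combined.rda`, so the raw files only need to be located at development time, when the working directory is known, and not during `R CMD INSTALL`.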

0 Answers