Questions tagged [drake-r-package]

The drake R package is a Make-like pipeline toolkit. Its purpose is to enhance reproducibility, automation, speed, and scale in R-focused data science workflows. Use this tag for general questions about usage or for help optimizing and debugging drake-powered projects. For bug reports and feature requests, please post to the GitHub issue tracker.

85 questions
9 votes, 2 answers

Can you have multiple plans using R package drake?

I know it is not best practice to use the R package called drake within a notebook tool, but I'm doing it anyway as a workaround for the limitations of the collaboration infrastructure we have on my team at work. Since my code is broken up into…
Brash Equilibrium
7 votes, 1 answer

subcomponent(mode = "in") for multiple source vertices

Question: In the igraph R package, is there an efficient implementation of subcomponent() and/or BFS that can handle multiple source vertices? Motivation: The drake R package models a user's workflow as a DAG of interdependent objects and files. The…
landau
6 votes, 0 answers

Lock environment but not .Random.seed

Is it possible to lock the global environment and still allow .Random.seed to be set or removed? The default behavior of lockEnvironment() is too aggressive for my use case: `lockEnvironment(globalenv()); rnorm(10)` #> Error in rnorm(10) : cannot add…
landau
4 votes, 1 answer

drake - map over ggplot targets to output them

First off, drake is just magical. I love the workflow of designing the dependency graph, and then executing it in one fell swoop. However, I ran into a roadblock. My workflow is simulating over large parameter grids, and then summarizing different…
ivan-k
3 votes, 1 answer

Best practices for unit tests on custom functions for a drake workflow

A drake workflow can have several custom functions stored in its R directory. It would make sense to develop unit tests for the custom functions. There are well-established tools and structures for running testthat unit tests on an R package in…
3 votes, 0 answers

Control the environment of nested calls to source()

In this new set of features, I am trying to steer drake away from the user's global environment. This is challenging because users can define arbitrarily nested code files. Let's say a user defines files packages.R, functions.R, and master.R as…
landau
2 votes, 1 answer

Using the import package with drake

Finding out about the drake package was one of the best recent discoveries as an R user. However, one drawback I see with the package in terms of reproducibility is the cluttering of the workspace with functions that are merely helper functions. No…
telegott
2 votes, 3 answers

What is the best practice for transferring objects across R projects?

I would like to use R objects (e.g., cleaned data) generated in one git-versioned R project in another git-versioned R project. Specifically, I have multiple git-versioned R projects (that hold drake plans) that do various things for my thesis…
shir
2 votes, 2 answers

drake plan fitting lmer models fails

I am trying to fit some lme4::lmer models in a drake plan, but I am getting the error "'data' not found, and some variables missing from formula environment". If I substitute an lm model, it works. Here is a reproducible…
Richard Telford
2 votes, 1 answer

Use a dedicated environment for drake, with r_make()

I'm trying to adapt the recommendation in Section 12.7.6.5 of the manual, to use a dedicated environment (rather than the global environment), to interactive usage with r_make(). What I did is to modify the _drake.R configuration script as…
ƒacu.-
2 votes, 1 answer

Create groups of targets

Let's say that I have the following plan: `test_plan = drake_plan(foo = target(x + 1, transform = map(x = c(5, 10))), bar = 42)`. Now I want to create a new target that contains the two subtargets foo_5, foo_10 and the target bar. How can I…
lordbitin
2 votes, 1 answer

Remove unused and old targets from drake cache

Over time, I have accumulated a lot of older targets in my drake cache (current==FALSE under drake_history()). I've renamed many of my targets over time, so I'm left with targets in drake_history() that are current==TRUE but are not in my current…
Rahul
2 votes, 2 answers

Generate chain or sequence of steps without naming them

I'd like to use drake to audit a series of validation and cleaning steps for a dataframe. I think there will be many functions that form a chain, where a dataframe will be passed in, a validation will happen, or a cleaning will happen, and the…
mpettis
2 votes, 2 answers

`r_make()` and `make()` not consistent in r-drake

I'm using RStudio for the work in question. I used to use drake::make() ignoring the prompt to use r_make() till yesterday when I decided to give it a try. Now, I'm in a bit of a pickle. Not sure what I've done / if I have found a bug. My project…
Rahul
2 votes, 2 answers

Faster way to slice a (raw) vector?

Problem: I am looking for a fast (ideally constant-time) way to take a large slice of a long raw vector in R. For example: `obj <- raw(2^32); obj[seq_len(2^31 - 1)]`. Even with ALTREP, base R takes too long: `system.time(obj[seq_len(2^31 - 1)])` #> user …
landau