Summary: I am trying to define a dvc step using dvc run where the command depends on environment variables (for instance $HOME). The problem is that when I define the step on machine A, the variable is already expanded by the time the command is stored in the .dvc file, so the step cannot be reproduced on machine B. Did I hit a limitation of dvc? If not, what's the right approach?

More details: I faced the issue when trying to define a step for which the command is a docker run. Say that:

  • on machine A myrepo is located at /Users/user/myrepo and
  • on machine B it is to be found at /home/ubuntu/myrepo.

Furthermore, assume I have a script myrepo/script.R which processes a data file to be found at myrepo/data/mydata.txt. Lastly, assume that my step's command is something like:

docker run -v $HOME/myrepo/:/prj/ my_docker_image /prj/script.R /prj/data/mydata.txt

If I run dvc run -f step.dvc -d ... -d ... [cmd], where [cmd] is the docker command above, then $HOME ends up in step.dvc already expanded to machine A's path. As a result, the step is broken on machine B.

Dror
  • Hi! Try wrapping your command in single quotes, so that your shell doesn't expand env vars right away. E.g. `dvc run -f step.dvc -d ... -d ... 'docker run -v $HOME/myrepo/:/prj/ my_docker_image /prj/script.R /prj/data/mydata.txt'`. – Ruslan Kuprieiev Jul 21 '19 at 12:18
  • Thanks! This seems to be the solution. I was close... I tried `"` :) – Dror Jul 22 '19 at 06:10
  • @RuslanKuprieiev I now realized what confused me... [`run.md#L29-L33`](https://github.com/iterative/dvc.org/blame/master/static/docs/commands-reference/run.md#L29-L33). Shouldn't `"` be replaced with `'` in the docs? – Dror Jul 23 '19 at 05:41
  • Good point! Created https://github.com/iterative/dvc.org/pull/498 for that. – Ruslan Kuprieiev Jul 23 '19 at 08:49
  • @RuslanKuprieiev can you create an answer so that we can close this? :) – Shcheklein Jul 23 '19 at 16:48
  • It's kind of more of a question about Shell than DVC, but probably a common concern for `dvc run` also. E.g. see https://stackoverflow.com/questions/840536/how-to-use-an-environment-variable-inside-a-quoted-string-in-bash – Jorge Orpinel Pérez Sep 19 '19 at 18:25

1 Answer

From docs:

Use single quotes ' instead of " to wrap the command if there are environment variables in it, that you want to be evaluated dynamically. E.g. dvc run -d script.sh './myscript.sh $MYENVVAR'
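
The difference between the two quoting styles can be seen without DVC at all. The following is a minimal sketch in plain POSIX shell, where printf stands in for dvc run (it simply receives the command string as an argument) and the value assigned to MYENVVAR is a made-up placeholder for illustration:

```shell
# Hypothetical value, used only for this illustration.
MYENVVAR="value-on-machine-A"

# Double quotes: the shell substitutes the variable before the receiving
# program (printf here, standing in for dvc run) ever sees the string.
expanded=$(printf '%s' "./myscript.sh $MYENVVAR")

# Single quotes: the text is passed through literally, so the variable
# name itself is what the receiving program gets.
literal=$(printf '%s' './myscript.sh $MYENVVAR')

echo "double quotes: $expanded"
echo "single quotes: $literal"
```

With double quotes the receiving program only ever sees the value from the machine where the command was typed. With single quotes it sees the literal text $MYENVVAR, which can then be resolved on whichever machine later executes the command.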

Ruslan Kuprieiev