20

How can a sourced or Sweaved file find out its own path?

Background:

I work a lot with .R scripts and .Rnw files. My projects are organized in a directory structure, but the path of the project's base directory frequently varies between computers, e.g. because I do only parts of the data analysis for someone else and their directory structure differs from mine. My project base directories are ~/Projects/StudentName/ or ~/Projects/Studentname/Projectname, while most students, who have just their one project, usually keep it under ~/Measurements/ or ~/DataAnalysis/ or the like, which wouldn't work for me.

So a line like

    setwd (my.own.path ()) 

would be incredibly useful, as it would ensure that the working directory is the base path of the project regardless of where that project actually lives, without the user having to think of setting the working directory.

Let me clarify: I am looking for a solution that works when an unthinking user simply presses the editor's/IDE's keyboard shortcut for source or Sweave.

cbeleites unhappy with SX
  • The option `chdir` of `source` seems similar to what you are asking. – James Jan 12 '12 at 13:07
  • My typical arrangement for this is to put each computer's project directory in a character variable set in .Rprofile. – Ari B. Friedman Jan 12 '12 at 13:11
  • @James: That's a partial solution. If I have to type in `source ("project/path/file.R", chdir = TRUE)`, I'm no better off than typing `setwd ("project/path")` and hitting the shortcut for interaction (most users will start with a script, possibly `source` it, and then go on more or less interactively while tweaking/adding to the script or Sweave file). – cbeleites unhappy with SX Jan 12 '12 at 13:55
  • @gsk3: I keep that in mind, but it needs care when syncing between computers. – cbeleites unhappy with SX Jan 12 '12 at 13:56
  • @cbeleites I'm not sure it can be done from within the script. If it's interactive file selection that's important, then something like `source(.scriptLoc<-file.choose()); setwd(dirname(.scriptLoc))` – James Jan 12 '12 at 14:38
  • Possible duplicate http://stackoverflow.com/q/13672720/1247080 – Stat-R Sep 25 '14 at 18:02

6 Answers

13

Just FYI, knitr will setwd() to the dir of the input file when (and only when) evaluating the code chunks, i.e. if you call knit('path/to/input.Rnw'), the working dir will be temporarily switched to path/to/. If you want to know the input dir in code chunks, currently you can call an unexported function knitr:::input_dir() (I may export it in the future).

Yihui Xie
  • Dear Yihui, good to know - I wasn't aware of `knitr` and it looks very promising. – cbeleites unhappy with SX Jan 16 '12 at 10:43
  • It's great that knitr has this smart default behaviour. However, `knitr:::input_dir()` defaults to "." even when a knitr chunk is executed in Rstudio. My workflow is building the script in Rstudio and only knitting at the end – so `knitr:::input_dir()` isn't sufficient. – Ruben Jul 30 '13 at 16:49
  • @Ruben I think that is a misunderstanding due to the fact that you were using RStudio: RStudio does `setwd()` (sets working directory to the dir of your input file) before calling `knitr`, and that is why you see `.`, which is absolutely correct. You should test it outside of RStudio, e.g. run in an R terminal `library(knitr); knit('path/to/input.Rnw')`, then you will see `input_dir()` is `path/to/` – Yihui Xie Jul 30 '13 at 19:12
  • @Yihui I'm trying to find a file's directory, even in interactive sessions. Rstudio defaults to the file's directory in some cases, but if you have many tabs open etc. you can get a situation where you'd like to reset to the source file location. `knitr:::input_dir()` helps with this only if you knit the document, not when you execute code chunks interactively. I see `.` because `knitr:::.knitEnv$input.dir` is `NULL` in interactive sessions, not because `.` is the current directory (if I alter the wd before calling `input_dir()` nothing changes). So, it's due to interactive sessions, right? – Ruben Jul 30 '13 at 21:48
  • @Ruben That is right. RStudio launches a new non-interactive R session to compile your documents, which is independent of your current interactive R session. `knitr:::input_dir()` only works _inside_ a document (isn't that sufficient?); once `knitr` quits, it is gone. – Yihui Xie Jul 30 '13 at 21:52
  • @Yihui I think that it may not be possible to get a file's directory from within R in an interactive session without recourse to the IDE. For my workflow, this wouldn't be a big problem, but if I want to send/leave files to co-workers, they should be unable to mess it up, if they just open files in a directory and execute all code. Unfortunately, with the way working directories persist sometimes, they can and will. My co-workers are not at the "use R scripts to open other R scripts" level - I'm happy if they save their scripts instead of relying on it to be restored. SPSS habits. – Ruben Jul 30 '13 at 21:52
  • @Ruben you can write a custom script (be it a *nix shell script or a Windows .bat file) to compile the documents if they do not use RStudio and ask them to click it; well, if they are able to compile the document without RStudio, I'm pretty sure they must be smart enough to understand `setwd()`. You can put in some checks to stop knitr if you think the working directory is wrong, e.g. `stopifnot(file.exists('your_file.Rnw'))` – Yihui Xie Jul 30 '13 at 22:12
  • @Yihui The stopifnot snippet at least works with a constant filename, across platforms, but it doesn't add much value over an error further down the line. Shell scripts or bat files are an option, but if I want to sacrifice portability across platforms, I can just hard-code the paths :-). Oh well, will have to write a small manual. – Ruben Jul 31 '13 at 07:16
  • @Ruben I guess an IDE or a smart enough editor is a better way to go than being trapped into these endless gory details. Click a button or press a key to get things done. No questions asked (TM). :) – Yihui Xie Jul 31 '13 at 07:34
11

Starting from gsk3's and Seb's suggestions, here's an idea:

  • the combination of username (login) and IP or name of the computer could be used to select the right directory.

That leads to something like:

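    setwd (switch (paste (Sys.info () [c ("user", "nodename")], collapse="."),
           user.laptop  = "~/Messungen",
           user2.server = "~/Projekte/Projekt/",
           # default: fail with a clear message for an unregistered user/computer
           stop ("unknown user/computer: add its project directory above")
           ))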

So there is an automatic solution that

  • works with source
  • works with Sweave
  • even works for interactive sessions where the commands are sent line by line

  • the combination of user and nodename of course needs to be specific

  • the paths need to be edited by hand, though.

Improvements welcome!
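
The user/nodename lookup above could also be kept in a named vector, which makes the table easier to extend. This is only a minimal sketch: `project.dirs` and `find.project.dir` are hypothetical names, and the `info` argument exists solely so the example can be run with a made-up machine.

```r
# Hypothetical lookup table: "<user>.<nodename>" -> project base directory.
project.dirs <- c(
  user.laptop  = "~/Messungen",
  user2.server = "~/Projekte/Projekt/"
)

# Look up the project directory for the current (or a given) user/computer.
find.project.dir <- function(dirs, info = Sys.info()) {
  key <- paste(info[c("user", "nodename")], collapse = ".")
  if (!key %in% names(dirs))
    stop("no project directory registered for '", key, "'")
  dirs[[key]]
}

# Example with a made-up machine; on a real one, drop the `info` argument
# and wrap the call in setwd().
find.project.dir(project.dirs, info = c(user = "user", nodename = "laptop"))
```

Failing loudly for an unknown combination seems preferable to silently staying in whatever directory the session started in.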


Update:

Gabor Grothendieck answered the following to a related question on r-help today:

    this.dir <- dirname(parent.frame(2)$ofile)
    setwd(this.dir)

which will work for source.
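
Since `ofile` only exists while `source()` is running, the lookup returns `NULL` when the lines are sent to the console one by one. A guarded variant might look like the following sketch; the temporary script, the names `src.file` and `demo_dir`, and the fallback to `getwd()` are assumptions made for the demonstration, not part of the original suggestion.

```r
# Write a small script to a temporary directory and source it (demo only).
demo_dir <- file.path(tempdir(), "ofile_demo")
dir.create(demo_dir, showWarnings = FALSE)
script <- file.path(demo_dir, "where_am_i.R")

writeLines(c(
  "src.file <- parent.frame(2)$ofile   # set by source(); NULL otherwise",
  "this.dir <- if (is.null(src.file)) {",
  "  getwd()                           # fallback: not run via source()",
  "} else {",
  "  normalizePath(dirname(src.file))",
  "}"
), script)

source(script)
print(this.dir)   # directory containing where_am_i.R
```

The script deliberately does not call `setwd()`, so running the demo leaves the session's working directory untouched.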


Another update: I now do most of the data analysis work in RStudio. RStudio's projects basically solve the problem: RStudio changes the working directory to the project root directory every time I switch between projects.

I can therefore put the project directory as far down my directory tree as I want (and the students can also put their copy wherever they want) and sync the data files and scripts/.Rnws via version control (We use a private git server). The RStudio project files are kept out of the version control, i.e. .gitignore contains .Rproj.user.

Obviously, within the project, the directory structure needs to be synchronized.

cbeleites unhappy with SX
  • For the moment I'll stick with this, as it allows the users easily (i.e. without depending on a `setwd` feature of their editor) to use the files interactively while they are developed. Thanks for the input that allowed me to have this idea. – cbeleites unhappy with SX Jan 16 '12 at 20:31
  • The first solution doesn't work in an arbitrary environment, and Grothendieck's solution doesn't work on my Mac (parent.frame() has no $ofile). – Ruben Jul 30 '13 at 16:40
  • @cbeleites In Rstudio on a windows machine, this is the error message I get "Error in dirname(parent.frame(2)$ofile) : a character vector argument expected" – WetlabStudent Dec 29 '14 at 17:50
  • @MHH: With RStudio I don't have the problem any more: I use projects which set the appropriate working directory when switching. I sync files with the students via git but ignore the RStudio project files. – cbeleites unhappy with SX Dec 29 '14 at 18:03
  • @cbeleites that is a really good idea. Would it be possible to provide a detailed description of how you do this (using links wherever you want to save yourself some writing)? That seems like the most practical answer to the initial question - would be very useful as an answer here I think. – WetlabStudent Dec 29 '14 at 18:09
  • Note that, if it's within a function, you'd need to use `parent.frame(3)$ofile`. It's only `2` if it's being executed at top-level in a script. – Kevin Nov 24 '15 at 05:09
4

You can use sys.calls() to get the command used to source the file. Then you need a bit of trickery using regular expressions to get the pathname, bearing in mind that source("something/filename") could have used either the absolute or relative path. Here's a first attempt at putting all the pieces together: try inserting the following lines at the top of a source file.

    whereFrom <- sys.calls()[[1]]
    # This should be a call that looks something like
    # source("pathname/myfilename.R")
    whereFrom <- as.character(whereFrom[2])  # get the pathname/filename
    # prefix a relative path with the current working directory;
    # leave an absolute path ("/..." or "C:/...") as it is
    if (!grepl("^(/|[A-Za-z]:)", whereFrom))
        whereFrom <- paste(getwd(), whereFrom, sep = "/")
    pathnameIndex <- gregexpr(".*/", whereFrom)  # we want the string up to the final '/'
    pathnameLength <- attr(pathnameIndex[[1]], "match.length")
    whereFrom <- substr(whereFrom, 1, pathnameLength - 1)
    print(whereFrom)  # or "setwd(whereFrom)" to set the working directory

It's not very robust: for instance, it will fail on Windows with source("pathname\\filename"), and I haven't tested what happens if one file sources another. Still, you might be able to build a solution on top of this.

Alexander Hanysz
3

I have no direct solution for obtaining the directory of the file itself, but if you have a limited range of directories and directory structures you can probably use

    if(file.exists("c:/somedir")==TRUE){setwd("c:/somedir")}

You could check out the pattern of the directory in question and then set the dir. Does this help you?
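
With several plausible locations, the same check can be run over a list of candidates. The sketch below assumes a hypothetical helper `first.existing.dir` and a made-up candidate list; `tempdir()` is included only so that the demo finds something on any machine.

```r
# Return the first directory in `candidates` that exists, or NULL if none does.
first.existing.dir <- function(candidates) {
  existing <- candidates[dir.exists(candidates)]
  if (length(existing) == 0) return(NULL)
  existing[[1]]
}

# Hypothetical candidate list, most specific location first.
candidates <- c("c:/somedir", "~/Measurements", "~/DataAnalysis", tempdir())
project.dir <- first.existing.dir(candidates)
if (!is.null(project.dir)) print(project.dir)  # or setwd(project.dir)
```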

Seb
  • Seb, unfortunately the directory names different persons prefer seem to be very different... – cbeleites unhappy with SX Jan 12 '12 at 13:57
  • Could search the whole directory tree once, then write the resultant directory to a constant stored in .Rprofile? :-) – Ari B. Friedman Jan 12 '12 at 19:05
  • The time needed to search the whole directory tree should teach anyone to organize himself with a Projects directory ;-) – cbeleites unhappy with SX Jan 12 '12 at 20:14
  • `==TRUE` is redundant in `if(file.exists("c:/somedir")==TRUE)` – Yihui Xie Jan 13 '12 at 15:23
  • FYI, `if(var==TRUE)` is not the same as `if(var)`. `if(var)` means `if(var!=FALSE)` or `if(var!=0)`, as per C tradition. `if(var==TRUE)` fails if var is not exactly `TRUE`, while `if(var)` only fails if `var` is exactly `0` or `FALSE`. As such, `if(123==TRUE)` fails and `if(123)` doesn't. – Jules G.M. Sep 22 '14 at 01:05
2

An additional problem is that the working directory is a global variable, which can be changed by any script, so if your script calls another script, it will have to set the wd back. In RStudio I use Session -> Set Working Directory -> To Source File Location (I know, it's not ideal), and then my script does

    wd = getwd ()
    ...
    source ("mySubDir/myOtherScript.R", chdir=TRUE); setwd (wd)
    ...
    source ("anotherSubDir/anotherScript.R", chdir=TRUE); setwd (wd)

In this way one can maintain a stack of working directories. I would love to see this implemented in the language itself.
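
The save-and-restore pattern can be wrapped in a small helper so that the previous directory comes back even if the code fails. This is a minimal sketch with a hypothetical function name, `with_wd`; the withr package offers `with_dir()` along similar lines.

```r
# Evaluate `code` with `dir` as the working directory, then restore the old one.
with_wd <- function(dir, code) {
  old <- setwd(dir)                # setwd() returns the previous directory
  on.exit(setwd(old), add = TRUE)  # restore on normal exit and on error
  force(code)                      # the lazily passed expression runs inside `dir`
}

start <- getwd()
inside <- with_wd(tempdir(), getwd())  # working directory as seen inside the helper
print(inside)
print(identical(getwd(), start))       # the original directory is back
```

Because every call restores the directory it found, nested calls behave like the stack of working directories described above.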

1

This answer works for source and also inside nvim-R - I have no idea if it works with knitr and similar things. Any feedback appreciated.

If you have multiple scripts source-ing each other, it is important to get the correct one. That is, the largest i for which sys.frame(i)$ofile exists.

    get.full.path.to.this.sourced.script = function() {
        for(i in sys.nframe():1) {  # Go through all the call frames,
                                    # in *reverse* order.
            x = sys.frame(i)$ofile
            if(!is.null(x))               # if $ofile exists,
                return(normalizePath(x))  #  then return the full absolute path
        }
    }
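
To see the effect of picking the largest `i`, here is a small demonstration with two scripts that source each other. The file names `outer.R` and `inner.R` and the temporary directory are made up for the sketch, and the function is repeated only so that the block runs on its own.

```r
get.full.path.to.this.sourced.script <- function() {
  for (i in sys.nframe():1) {        # innermost frame first
    x <- sys.frame(i)$ofile
    if (!is.null(x)) return(normalizePath(x))
  }
}

# Hypothetical layout: outer.R sources sub/inner.R.
demo_dir <- file.path(tempdir(), "nested_source_demo")
dir.create(file.path(demo_dir, "sub"), recursive = TRUE, showWarnings = FALSE)
outer <- file.path(demo_dir, "outer.R")
inner <- file.path(demo_dir, "sub", "inner.R")

writeLines("inner.path <- get.full.path.to.this.sourced.script()", inner)
writeLines(c(
  sprintf("source(%s)", deparse(inner)),  # deparse() quotes and escapes the path
  "outer.path <- get.full.path.to.this.sourced.script()"
), outer)

source(outer)
print(inner.path)  # path of inner.R, although outer.R was sourced first
print(outer.path)  # path of outer.R
```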

Aaron McDaid