5

I am a Stata user trying to learn R.

I have a couple of lengthy folder paths which, in my Stata code, I stored as local macros. I have multiple files in both those folders to use in my analysis.

I know that in R I can change the working directory each time I want to refer to a file in one of the folders, but that is clearly not a good way to do it. Even if I store the folder paths as strings in R, I can't figure out how to refer to them. For example, in Stata I would use `folder1'.

I am wondering if trying to re-write Stata code line by line in R is not the best way to learn R.

Can someone please help?

drhagen
user2012406
  • I think you're looking for `list.files(.)`. Look [**here**](http://stackoverflow.com/questions/5758084/loop-in-r-loading-files/5758134#5758134). Also check `?list.files` for all possible options. – Arun Mar 28 '13 at 20:20
  • "Even if I store the folder paths as strings in R, I can't figure out how to refer to those (like using `folder1' in Stata)." Can you give a concrete example of this problem, with code? – joran Mar 28 '13 at 20:25
  • @joran folder1 is the name of the variable. Surrounding it with backtick/tick resolves the name and returns the value. Thinking about Stata is going to give me nightmares... – Joshua Ulrich Mar 28 '13 at 20:30
  • @joran folder1 is the name of the local. An example is `local folder1 "Z:/Project/Data/Raw"`. Suppose this folder Raw has a bunch of datasets I need to use, each time I want to load the dataset, I don't want to repeat "Z:/Project/Data/Raw". Instead, in Stata I stored it as a local and say `use "`folder1'/file1.dta"` – user2012406 Mar 28 '13 at 21:35
  • I think the short answer is that there is no one-to-one equivalent of Stata's local macros in R, so you need to learn how to do things differently, and in fact more directly. – Nick Cox Mar 28 '13 at 22:51

3 Answers

7

Maybe you want file.path()?

a <- "c:"
b <- "users"
c <- "charles"
d <- "desktop"

setwd(file.path(a,b,c,d))
getwd()
#----
# [1] "c:/users/charles/desktop"

You can wrap source or read.XXX or whatever else around that to do what you want.
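As a minimal sketch of that idea without changing the working directory: store the folder once and build each file path with `file.path()`. The folder here is a temporary directory standing in for something like "Z:/Project/Data/Raw", and `file1.csv` is a hypothetical file written only so the example runs.

```r
# Stand-in for a long project folder such as "Z:/Project/Data/Raw"
# (assumption: a temporary directory is used so the example runs anywhere).
folder1 <- tempdir()

# Write a small hypothetical file so there is something to read back.
write.csv(data.frame(id = 1:3, x = c(2.5, 3.1, 4.8)),
          file.path(folder1, "file1.csv"), row.names = FALSE)

# Rough R counterpart of referring to the Stata local folder1 in a use command:
# the stored string is passed directly to file.path().
file1 <- read.csv(file.path(folder1, "file1.csv"))
```

The variable `folder1` is an ordinary character string, so it needs no special quoting to be "resolved"; passing it to a function is enough.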

Chase
  • @user2012406 I'm glad you got an answer that solved your problem! It helps improve the quality of the site if you indicate this by clicking the check mark by the answer that solved your problem. (You are never under any obligation to do so, but it helps signal to others which answer actually solved your problem.) – joran Mar 28 '13 at 22:32
  • @joran Sorry about that. I joined Stack Overflow just a couple of days ago. I have been using it on and off when it popped up in my search results when working on stata but I never had an account until now. I still don't have enough reputation to upvote or downvote anything. I will remember to revisit these answers to upvote when I get the reputation needed. – user2012406 Apr 01 '13 at 20:37
  • @user2012406 You don't need any rep to click on the check mark to indicate which answer solved your problem. In fact, doing so will earn you some rep! – joran Apr 01 '13 at 20:49
  • @joran I checked the most appropriate answer. You are right. It improved my reputation, although I still don't have enough to vote up or down anything. – user2012406 Apr 02 '13 at 21:07
  • @user2012406 Now you do! :) – joran Apr 02 '13 at 22:21
6

First, as a former Stata user, let me recommend R for Stata Users. There is also this article on Macros in R. I think @Nick Cox is right that you need to learn to do things differently. But like you (at least in this case), I often find myself starting a new task with my prior knowledge of how to do it in Stata and going from there. Sometimes I find the approaches are similar. Sometimes I can make R act like Stata when a different approach would be better (e.g., loops vs. vectorization).

I'm not sure if I will capture your question with the following, but let me try.

In Stata, it would be common to write:

global mydata "path to my data directory/"

To import the data, I would just type:

insheet using "${mydata}myfile.csv"

As a former Stata user, I want to do something similar in R. Here is what I do:

mydata <- "path to my data directory/"

To import a csv file located in this directory and create a data frame called myfile, I would use:

myfile <- read.csv(paste(mydata, "myfile.csv", sep=""))

or, more concisely:

myfile <- read.csv(paste0(mydata, "myfile.csv"))

I'm not a very efficient R user yet, so maybe others will see some flaws in this approach.
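One possible pitfall with the `paste0()` approach, sketched below, is that it depends on the stored path ending in a slash. `file.path()` (suggested in another answer) adds the separator itself. The variable names here are hypothetical, and no files are read; the code only builds the path strings.

```r
mydata_with_slash    <- "path to my data directory/"
mydata_without_slash <- "path to my data directory"

# paste0() joins the strings exactly as given, so a missing trailing
# slash silently produces a wrong path.
p_ok  <- paste0(mydata_with_slash, "myfile.csv")
p_bad <- paste0(mydata_without_slash, "myfile.csv")

# file.path() inserts the "/" separator between its arguments.
p_fp <- file.path(mydata_without_slash, "myfile.csv")

p_ok   # "path to my data directory/myfile.csv"
p_bad  # "path to my data directorymyfile.csv"
p_fp   # "path to my data directory/myfile.csv"
```

Storing the folder without a trailing slash and always going through `file.path()` keeps the two conventions from being mixed.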

Eric Green
  • Stata calls retrieval of a named string (i.e. character vector) a 'macro'? – IRTFM Mar 28 '13 at 23:29
  • It is an example of a use of a global macro. see [here](http://data.princeton.edu/stata/programming.aspx). There are many more interesting uses. – Eric Green Mar 28 '13 at 23:53
  • Ah. My impression is that R uses list structures more than Stata. There is a function which I have mainly seen used in lattice graphics called `modifyList` which might allow similar uses. There are also expressions and the `substitute` function in the language manipulation domain that might be needed to get something like that functionality. It appears that Stata presumes it will get ordered text arguments without as many separators, while R has a greater degree of separation of character vectors from actual language elements. – IRTFM Mar 29 '13 at 01:49
1

I'm guessing from context that the term "local" when applied to files means that they have been loaded into memory for efficiency purposes? If so, then you need to realize that pretty much all ordinary R objects are handled that way. See ?read.table and ?load. The only way data can remain non-local is to have it reside in a database that has an interface package that supports SQL queries or use specialized packages such as ff or bycol.

Other than that and Chase's idea to use file.path(), any reference to files or connections is done using the proper read/load/scan functions to which character values are given as (variously named) arguments. You can see a variety of low-level functions with ?file and perhaps following some of the additional links from that help page. You could store one or more results of a file.path construction in a character vector which could be named for easy reference.

pathvecs <- c(User = "~/", hrtg = "~/Documents/Heritage/")
pathvecs
#                   User                    hrtg 
#                   "~/" "~/Documents/Heritage/" 
pathvecs["hrtg"]
#                   hrtg 
#"~/Documents/Heritage/" 
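A named vector like this can be combined with `file.path()` and with `list.files()` (mentioned in the comments on the question) to load several files from one stored folder. The following is only a sketch: the folder names `raw` and `clean` and the files `a.csv` and `b.csv` are hypothetical, and a temporary directory stands in for the real project location.

```r
# Hypothetical project folders under a fresh temporary directory.
base  <- tempfile("project_")
paths <- c(raw = file.path(base, "raw"), clean = file.path(base, "clean"))
for (p in paths) dir.create(p, recursive = TRUE)

# Write two small example files into the "raw" folder.
write.csv(data.frame(a = 1:2), file.path(paths[["raw"]], "a.csv"), row.names = FALSE)
write.csv(data.frame(a = 3:5), file.path(paths[["raw"]], "b.csv"), row.names = FALSE)

# List every csv in the "raw" folder and read them all into a named list.
raw_files <- list.files(paths[["raw"]], pattern = "\\.csv$", full.names = TRUE)
raw_data  <- lapply(raw_files, read.csv)
names(raw_data) <- basename(raw_files)
```

Each long path is typed once, and `paths[["raw"]]` plays the role that the local macro played in the Stata code.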
IRTFM
  • By local, I mean the macro local in Stata. I am sorry but I guess I wasn't clear with my question. I know how to load the data. I am figuring out how to avoid repeating lengthy file paths by storing them as a small "local" and using the local-name instead. – user2012406 Mar 28 '13 at 21:38
  • That does not help me understand. I cannot figure out why a named character vector is not an effective way to document and store paths. – IRTFM Mar 28 '13 at 21:41