
I am trying to make my current project reproducible, so I am creating a master document (eventually a .rmd file) that will call and execute several other documents. This way, other investigators and I only need to open and run one file.

There are three layers to the current setup: master file, 2 read-in files, 2 databases. The master file calls the read-in files using source(), and the read-in files parse the .csv databases and apply labels.

The read-in files and the databases are generated automatically with the data management software I'm currently using (REDCap) each time I download the updated data.

However, the read-in files have a line of code that removes all of the objects in my environment. I would like to edit the read-in files directly from the master file so that I do not have to open the read-in files individually each time I run my report. Specifically, since all the read-in files are the same, I would like to remove line #2 in each.

I've tried searching Google, and tried file.edit(), but have been unable to find anything. Not even sure it is possible, but figured I would ask. Let me know if I can improve this question or if you need any additional code to answer it. Thanks!

Current relevant master code (edited for generality):

 source("read-in1")  
 source("read-in2")

Current relevant read-in file code (same in each file, except for the database name):

 #Clear existing data and graphics  
 rm(list=ls())  
 graphics.off()  
 #Load Hmisc library  
 library(Hmisc)  
 #Read Data  
 data=read.csv('database.csv')  
 #Setting Labels  

[read-in code truncated]

Additional details:
OS: Windows 7 Professional x86
R version: 3.1.3
R Studio version: 0.99.441

wibeasley
  • This would probably be a good time to point out that this is a good reason not to have `rm(list=ls())` in a script file. I'm not sure what you are trying to prevent there. You can source files into an environment other than the default environment if you really wanted via `local=` or `sys.source()` – MrFlick Jun 17 '15 at 00:51
  • I definitely agree, but in this case it sounds like this is REDCap's fault in their default data-through-script export process... It sounds as though the OP doesn't have any control over this, though may be able to export the data via some other format and ingest the data directly to R rather than by R script. I don't use it so hard to say? – Forrest R. Stevens Jun 17 '15 at 02:59
  • That's correct, Forrest. I can either export it "for use in R", which generates the afore-mentioned read-in files, or just export it as a csv and write the read-in file myself. The problem is I have nearly 2,000 variables so that's not really feasible. Hence my conundrum. – Carolyn W Clayton Jun 17 '15 at 04:27
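
To illustrate MrFlick's suggestion above, here is a minimal sketch of sourcing a script into its own environment so that `rm(list=ls())` only clears that environment. The temporary file is a hypothetical stand-in for a REDCap-generated read-in script, and the names `redcap_env` and `keep_me` are made up for the example.

```r
# Hypothetical stand-in for a REDCap-generated read-in script.
script <- tempfile(fileext = ".R")
writeLines(c(
  "#Clear existing data and graphics",
  "rm(list=ls())",
  "data <- data.frame(id = 1:3)"
), script)

keep_me <- "still here"              # an object in the calling environment

# Evaluate the script inside a fresh environment. ls() and rm() then act on
# redcap_env, not on the environment that called source().
redcap_env <- new.env()
source(script, local = redcap_env)   # sys.source(script, envir = redcap_env) is similar

exists("keep_me")                    # the caller's object is untouched
data <- redcap_env$data              # pull the loaded data back out
```

With this approach the generated files never need to be edited; the master file just copies what it needs out of `redcap_env` after each `source()` call.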

3 Answers


You might try readLines() and something like the following (which was simplified greatly by a suggestion from @Hong Ooi below):

eval(parse(text = readLines("read-in1.R")[-2]))
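
If you have several read-in files, the one-liner could be wrapped in a small helper, sketched below. The function name `source_skipping` and its arguments are hypothetical. Note that the lines are passed to `parse()` through `text =`, because `parse()` treats its first positional argument as a file name or connection. The temporary file stands in for a real read-in script.

```r
# Hypothetical helper: evaluate every line of a script except the skipped ones.
source_skipping <- function(path, skip = 2, envir = parent.frame()) {
  lines <- readLines(path)
  # text = tells parse() that these are lines of code, not a file name
  exprs <- parse(text = lines[-skip], keep.source = FALSE)
  invisible(eval(exprs, envir))
}

# Small demonstration with a temporary stand-in script.
demo <- tempfile(fileext = ".R")
writeLines(c(
  "#Clear existing data and graphics",
  "rm(list=ls())",
  "answer <- 6 * 7"
), demo)

target <- new.env()
source_skipping(demo, envir = target)
target$answer
```

In the master file this would become something like `source_skipping("read-in1.R")` followed by `source_skipping("read-in2.R")`, assuming the `rm()` call is always on line 2.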

My original solution, which was much more pedantic:

f <- file("read-in1.R", open="r")
t <- readLines(f)
close(f)

for (l in t[-2]) { eval(parse(text=l)) }

The for() loop parses and evaluates each line of the text file in turn, skipping the second one (that's what the -2 index value does). If you're reading and writing longer files, the following will be much faster than the for() loop, though still less preferable than @Hong Ooi's one-liner:

f <- file("read-in1.R", open="r")
t <- readLines(f)
close(f)

f <- file("out.R", open="w")
writeLines(t[-2], f)
close(f)
source("out.R")
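
One caveat about the line-by-line for() loop, shown in the sketch below: it only works if every statement fits on a single line, because each line is parsed on its own. Parsing all the lines together (as the one-liner and the `source("out.R")` variant do) handles statements that span several lines. The three example lines are made up for illustration.

```r
# A statement that spans two lines, followed by one that uses its result.
lines <- c("x <- c(1,",
           "       2)",
           "y <- sum(x)")

# Line by line: the first line is incomplete on its own, so parse() fails.
per_line <- tryCatch({
  env1 <- new.env()
  for (l in lines) eval(parse(text = l), env1)
  "ok"
}, error = function(e) "parse error")

# All lines at once: parse() sees the complete statements.
env2 <- new.env()
eval(parse(text = lines), env2)

per_line   # the loop stopped with a parse error
env2$y     # the whole-file version evaluated both statements
```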
Forrest R. Stevens
  • This worked, thanks! Although I have not yet figured out how to test the timings, I think it might run a bit slower than LegalizeIt's answer since it for-loops through thousands of lines of code, but it is a simple and readable solution, and I will keep it in mind and will likely implement it in the future. Thanks so much for your help! – Carolyn W Clayton Jun 17 '15 at 04:43
  • I didn't realize speed was an issue here, but if that's so then you should use my second option above. – Forrest R. Stevens Jun 17 '15 at 05:17
  • Note that you don't actually need an explicit for loop or a separate write. This works as well: `eval(parse(readLines("read-in1.R")[-2]))` – Hong Ooi Jun 17 '15 at 05:31
  • Ah, you're right @Hong Ooi, that's much simpler. I didn't realize that `readLines()` would accept a character string and interpret it as a file name. I appreciate the clarification and the pointer! – Forrest R. Stevens Jun 17 '15 at 05:34
  • Wow, both of those are so elegant. Thank you! – Carolyn W Clayton Jun 17 '15 at 15:55

Sorry I'm so late in noticing this question, but you may want to investigate getting access to the REDCap API and using either the redcapAPI package or the REDCapR package. Both of those packages will allow you to export the data from REDCap directly into R without having to use the download scripts. redcapAPI will even apply all the formats and dates (REDCapR might do this now too. It was in the plan, but I haven't used it in a while).

Benjamin

You could try this. It just calls some shell commands that (1) rename the file, (2) copy all lines not containing rm(list=ls()) to a new file with the same name as the original file, and (3) delete the renamed copy.

files_to_change <- c("read-in1.R", "read-in2.R")
for (f in files_to_change) {
    old <- paste0(f, ".old")
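    system(paste("cmd.exe /c ren", f, old))
    # /v keeps non-matching lines; /l with /c: treats the quoted text as a literal string
    system(paste('cmd.exe /c findstr /v /l /c:"rm(list=ls())"', old, ">", f))
    # del is the cmd.exe command for deleting a file (rm is not a cmd.exe built-in)
    system(paste("cmd.exe /c del", old))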
}

After calling this loop you should have

#Clear existing data and graphics  
graphics.off()  
#Load Hmisc library  
library(Hmisc)  
#Read Data  
data=read.csv('database.csv')  
#Setting Labels  

in your read-in*.R files. You could put this in a batch script

@echo off
ren "%~f1" "%~nx1.old"
findstr /v "rm(list=ls())" "%~f1.old" > "%~f1"
del "%~f1.old"

say, "example.bat", and call that in the same way using system.
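
If you would rather not depend on cmd.exe, roughly the same steps can be done in plain R. The sketch below is an alternative to the shell approach, not a translation of it: the function name `strip_rm_line` is hypothetical, it matches the line by its exact text, and it keeps the `.old` backup instead of deleting it. The temporary file stands in for a real read-in file.

```r
# Hypothetical helper: rewrite a script in place without the rm(list=ls()) line.
strip_rm_line <- function(path, pattern = "rm(list=ls())") {
  backup <- paste0(path, ".old")
  file.copy(path, backup, overwrite = TRUE)      # keep the original as a backup
  lines <- readLines(backup)
  keep <- !grepl(pattern, lines, fixed = TRUE)   # fixed = TRUE: literal match, no regex
  writeLines(lines[keep], path)
  invisible(backup)
}

# Demonstration on a temporary stand-in file.
demo <- tempfile(fileext = ".R")
writeLines(c(
  "#Clear existing data and graphics",
  "rm(list=ls())",
  "graphics.off()"
), demo)

backup <- strip_rm_line(demo)
readLines(demo)
```

Matching on the text rather than on line number 2 means the helper still works if REDCap ever changes the layout of the generated file, as long as the line itself is written the same way.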

Rorschach