I have a model that outputs data in the form of .csv files. The output directory is full of .csv files, each named n.csv where n is the run number. So on the 0th run it creates 0.csv, on the 1st run it creates 1.csv, etc.

Now I want to analyze this data in R and compare it to the output of another model. I wrote a function that runs the analysis I want on two models, each passed in as a function. The model I'm comparing against is a built-in sna function, and to make a function that simulates my model I wrote the following closure:

#creates a model function that returns sequentially numbered .csv files from a directory
make.model <- function(dir) {
  i <- -1 # start at -1 so the first call reads 0.csv
  model <- function() {
    i <<- i + 1
    my.data <- as.matrix(read.csv(paste0(dir, i, ".csv"), header=FALSE))
    return(my.data)
  }
  return(model)
}

The issue I am running into is that although

my.model <- make.model(directory)
spectral.analysis(my.model, other.model, observed.data, nsim = 100)

does exactly what I want and computes how well my model and the other model fit the observed data, it's not reusable. The counter inside the closure is incremented permanently, so my.model can only be called as many times as there are .csv files before it tries to read one that doesn't exist.

I am currently getting around this with a "reset" function that redefines my.model, which I run after each use of my.model. This seems like a very poor solution.

Is there a cleverer way to go about this? Crucially, spectral.analysis() takes functions as its inputs and calls them to obtain its values, and rewriting that function isn't on the table right now. I'm not passing the data directly from my model to the analysis function because my model takes hours to run, so I want to be able to run a lot of trials in advance and analyze them later.

  • Try i = i+1 instead of i <<- – akaDrHouse Jan 12 '17 at 19:27
  • @akaDrHouse I'm under the impression that using "=" for variable assignment is considered bad style in R, with "<-" preferred. In any case, doing that doesn't work. Using either "<-" or "=" causes the function to look at 0.csv each time it's run, because the incrementation doesn't make it out of the scope of the current run. – Stella Biderman Jan 12 '17 at 19:36
  • Using global assignment is widely considered *bad practice*. It changes how your code runs and how you architect things. The `<-` vs `=` is a *style preference* which, except for one or two extremely rare cases, doesn't change what your code does, only changes what it looks like. – Gregor Thomas Jan 12 '17 at 20:14
  • @Gregor Except I'm not using global assignment, because it's bound by the closure. There is no variable i in the global environment. This precise use of <<- is recommended here: http://stackoverflow.com/questions/2628621/how-do-you-use-scoping-assignment-in-r – Stella Biderman Jan 12 '17 at 20:17
  • Okay, not *global* assignment, but rather *parent frame* assignment. And the problem you have is 100% because you built your code with that paradigm. I'd recommend rewriting either the `model` or the `make.model` function so that it can optionally take `i` as an argument. – Gregor Thomas Jan 12 '17 at 20:20
  • @Gregor Can you elaborate on that, preferably in the form of an answer? I don't see how to use that to solve my problem. – Stella Biderman Jan 12 '17 at 20:24
  • Could you use length(list.files(pattern=".csv")) ? That would count the number of csv files there. Then you could just loop through them. – akaDrHouse Jan 12 '17 at 20:24
  • @akaDrHouse Although your for loop example doesn't suit my needs, length(list.files(pattern=".csv")) does in fact work for this. By replacing `i <<- i + 1` with `i <<- (i + 1) %% length(list.files(dir, pattern=".csv"))` and making a few adjustments (`list.files` requires "C:/Documents" while `read.csv` requires "C:/Documents/") I can make it work! – Stella Biderman Jan 12 '17 at 20:37
  • I don't think I can make a full answer at this point... it is tricky, and I'm not sure I really understand how you're using it. I'm glad @akaDrHouse's suggestion is helpful; to add on to that, if you use `full.names = T` for `list.files` you won't need to do any adjustments to pass it to `read.csv`. – Gregor Thomas Jan 12 '17 at 20:58
  • @Gregor I understand. Thanks for the tip about full.names! – Stella Biderman Jan 12 '17 at 20:59
  • Glad you got it working! – akaDrHouse Jan 12 '17 at 21:13

1 Answer

Self-answering to close: I figured it out with help from the comments.

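length(list.files(dir, pattern=".csv"))

gives the number of .csv files in the output directory, so changing the line where i is incremented to

i <<- (i + 1) %% length(list.files(dir, pattern=".csv"))

solves the problem: once the counter passes the last file it wraps back around to 0.csv instead of running off the end. Note that the directory has to be passed to list.files explicitly (otherwise it counts the files in the working directory), and that list.files wanted the path without a trailing slash ("C:/Documents") while my paste0 call needed it with one ("C:/Documents/").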
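For completeness, here is a minimal sketch of what the whole closure might look like with the wrap-around counter in place. It assumes `dir` is given without a trailing slash and that the directory contains only the sequentially numbered files 0.csv, 1.csv, and so on. The use of `file.path` (to sidestep the trailing-slash mismatch) and the stricter pattern `"\\.csv$"` are my own choices for the sketch, not something the analysis function requires.

```r
# Sketch: returns a model function that cycles through 0.csv, 1.csv, ... in `dir`.
# Assumes `dir` has no trailing slash and holds only sequentially numbered .csv files.
make.model <- function(dir) {
  i <- -1  # start at -1 so the first call reads 0.csv
  model <- function() {
    # Recount on every call so that files written later are picked up too.
    n <- length(list.files(dir, pattern = "\\.csv$"))
    if (n == 0) stop("no .csv files found in ", dir)
    i <<- (i + 1) %% n  # wrap back to 0 after the last file
    as.matrix(read.csv(file.path(dir, paste0(i, ".csv")), header = FALSE))
  }
  model
}
```

One thing to keep in mind: the counter still persists between calls, so a second run of the analysis picks up wherever the previous one stopped rather than at 0.csv. It just no longer falls off the end of the directory.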