
I am trying to manage multiple files in R but am having a difficult time of it. I want to take the data in each of these files and manipulate them through a series of steps (all files receiving the same treatment). I think that I am going about it in a very silly manner, though. Is there a way to manage many files (each the same as before) without using 900 apply statements? For example, when is it recommended to merge all the data frames rather than treat each separately? Is there a way to merge more than two, or an uncertain number, as with the way the files are input here? Or is there a better way to handle so many files?

I take files in a standard way:

library(tcltk)  # tk_choose.files() comes from the tcltk package
chosen <- tk_choose.files(default = "", caption = "Files:", multi = TRUE, filters = NULL, index = 1)

But after that I would like to do several things with the data. As of now I am just applying different things, but it is getting confusing. See:

ytrim <- lapply(chosen, function(x) strtrim(y, width = 11))
chRead <- lapply(chosen, read.table, header = TRUE)
tmp <- lapply(inputFiles, function(x) stack(fnctn))

etc, etc. This surely can't be the recommended way to go about it. Is there a better way to handle a multitude of files?

Stephopolis
    Please don't mind me asking, but you have a whole lot of unbound variables in your example, e.g. `y` and `fnctn`. However, for starters, you could either clean up by subsuming the various steps into one single function, or you could use `Reduce` to accumulate the result while iterating over the list. – fotNelton Aug 21 '12 at 20:00
  • I am not 100% sure what you mean by unbound, BUT if you mean that you don't see where they are declared, then that is because I didn't include all of my code, for sanity's sake. I just wanted to show that the way I am trying to handle it is bulky and highly repetitive. As for Reduce, I will go have a look! Thanks for the suggestion! – Stephopolis Aug 21 '12 at 20:02
  • I'm not sure what bothers you about this. Does it work? If so, then don't fix something that isn't broken. If it is slow (and there is no reason why it should be), then give details about the issue, and we can help. Otherwise, try to be a bit more specific about the question. – Andrie Aug 21 '12 at 21:05
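
A minimal sketch of the `Reduce` approach fotNelton mentions, assuming each file holds a table sharing a key column named "id" (a hypothetical name, not something the question guarantees). Because `merge` takes two data frames at a time, `Reduce` folds it over the whole list, which handles an arbitrary number of files:

# Read every chosen file into a list of data frames
dfs <- lapply(chosen, read.table, header = TRUE)

# Merge pairwise across the whole list; assumes a shared "id" column
merged <- Reduce(function(a, b) merge(a, b, by = "id"), dfs)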

1 Answer


You can write one function with all operations, and apply it to all your files like this:

doSomethingWithFile <- function(filename) {
    # Note: this trims the file *name* to 11 characters, not the data
    ytrim <- strtrim(filename, width = 11)
    # Read the file's contents
    chRead <- read.table(filename, header = TRUE)
    # Return some result
    chRead
}

result <- lapply(chosen, doSomethingWithFile)

You will only need to think about how to return the results, as lapply returns a list with the same length as its input (chosen, in this case). You could also look at one of the apply functions of the plyr package for more flexibility.
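
For example, if each call returns a data frame with the same columns (an assumption about your files, not something the example guarantees), one way to collapse the list into a single data frame is:

# Stack the per-file data frames into one; assumes identical columns
combined <- do.call(rbind, result)

# Or let plyr do the looping and combining in one step
library(plyr)
combined <- ldply(chosen, doSomethingWithFile)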

(BTW: this code is not without errors, but neither is your example... I'll update mine if you give a proper example)

ROLO
  • At this point I feel more concerned about the overall concept than a specific example. I was just wondering if it made sense to continue applying at different times or if there is a kind of standard practice for taking in many files. – Stephopolis Aug 21 '12 at 20:08