Automate analysis over multiple .txt files

Question

I have many copies of two types (a and b) of .txt file, i.e.:

a1.txt a2.txt a3.txt... and b1.txt b2.txt b3.txt

My aim is to run an r script that does the following:

read.table a1.txt
#run a bunch of code that chops and changes the data and then stores some vectors and data frames.
w<-results
x<-results
detach a1.txt
read.table b1.txt
#run a bunch of code that chops and changes the data and then stores some vectors and data frames.
y<-results
z<-results
model1<-lm(w~y)
model2<-lm(x~z)

Each time, I want to extract coefficients, e.g. 1 slope from model1 and 2 slopes from model2. I want to run this analysis in an automated way across all pairs of a and b text files and build up the coefficients in vector format in one other file for later analysis.

So far I have only been able to get bits and bobs from simpler analyses like this. Does anyone have a good idea of how to run this more complex iteration over many files?

EDIT: Tried so far but failed as yet:

your<-function(x) 
{
files <- list.files(pattern=paste('.', x, '\\.txt', sep=''))
a <- read.table(files[1],header=FALSE)
attach(a)
w <- V1-V2
detach(a)
b <- read.table(files[2],header=FALSE)
z <- V1-V2
model <- lm(w~z)
detach(b)
return(model$coefficients[2])
}

slopes <- lapply(1:2, your)
Error in your(1) : object 'V1' not found

Two things. First, it's best to avoid using `attach` and `detach`. Second (and this would have been avoided by addressing the first!), after you read `b` you don't `attach` it. Instead use something like `z <- b$V1 - b$V2`. – Justin, Apr 12 '12 at 21:43
I just got that cracked before you got back to me. Thanks very much! – user1322296, Apr 12 '12 at 21:48
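
The fix suggested in the comment above can be sketched roughly as follows. This is a minimal sketch, not the asker's final code: it assumes each file has at least two unnamed numeric columns (which `read.table(header=FALSE)` names `V1` and `V2`) and that each a/b pair has the same number of rows. The function name `pair_slope`, the `dir` argument, and the anchored pattern are assumptions added for illustration.

```r
# Hypothetical helper: fit lm(w ~ z) for one a/b file pair without attach().
pair_slope <- function(x, dir = ".") {
  # Anchored pattern (an assumption): x = 1 matches a1.txt and b1.txt only
  files <- sort(list.files(dir, pattern = paste0("^[ab]", x, "\\.txt$"),
                           full.names = TRUE))
  a <- read.table(files[1], header = FALSE)
  b <- read.table(files[2], header = FALSE)

  # Columns are referenced through their data frame, so nothing is attached
  w <- a$V1 - a$V2
  z <- b$V1 - b$V2

  model <- lm(w ~ z)
  unname(coef(model)[2])   # slope of w on z
}
```

With files in the working directory, `slopes <- sapply(1:2, pair_slope)` would then return a plain numeric vector of slopes, one per file pair.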

score 3 · Accepted Answer · edited Apr 13 '12 at 06:17

You can do something like:

files <- list.files(pattern='.1\\.txt') # get a1.txt and b1.txt

If you know how many files you have (let's say 10), you can wrap your code above in a function and use one of the apply family, depending on your desired output:

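your.function <- function(x) {
  # x is the file index: x = 1 picks up a1.txt and b1.txt
  files <- list.files(pattern=paste('.', x, '\\.txt', sep=''))
  a <- read.table(files[1])
  b <- read.table(files[2])

  # placeholders for your own processing of a
  w <- ...
  x <- ...

  # placeholders for your own processing of b
  y <- ...
  z <- ...

  model1 <- lm(w~y)
  model2 <- lm(x~z)

  # the second coefficient of each model is the slope
  return(c(model1$coefficients[2], model2$coefficients[2]))
}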

slopes <- lapply(1:10, your.function)
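
To get from the list of slopes to "one other file", as the question asks, something along these lines might work. It is only a sketch under stated assumptions: the files are named `a<i>.txt` and `b<i>.txt`, the four derived vectors are simple stand-ins (sums and differences of `V1` and `V2`) for the processing the question leaves out, and the names `fit_pair`, `collect_slopes`, and `slopes.csv` are hypothetical.

```r
# Hypothetical sketch: fit both models for one a/b pair and return the slopes.
fit_pair <- function(i, dir = ".") {
  a <- read.table(file.path(dir, paste0("a", i, ".txt")), header = FALSE)
  b <- read.table(file.path(dir, paste0("b", i, ".txt")), header = FALSE)

  # Stand-ins for the real data preparation, which the question elides
  w <- a$V1 - a$V2
  x <- a$V1 + a$V2
  y <- b$V1 - b$V2
  z <- b$V1 + b$V2

  c(index  = i,
    slope1 = unname(coef(lm(w ~ y))[2]),
    slope2 = unname(coef(lm(x ~ z))[2]))
}

# Run fit_pair over every complete a/b pair in dir and write one row per pair.
collect_slopes <- function(dir = ".", out = file.path(dir, "slopes.csv")) {
  # Discover the indices from the a files, then keep those with a matching b file
  a_files <- list.files(dir, pattern = "^a[0-9]+\\.txt$")
  idx <- sort(as.integer(sub("^a([0-9]+)\\.txt$", "\\1", a_files)))
  idx <- idx[file.exists(file.path(dir, paste0("b", idx, ".txt")))]

  slopes <- as.data.frame(do.call(rbind, lapply(idx, fit_pair, dir = dir)))
  write.csv(slopes, out, row.names = FALSE)
  slopes
}
```

Discovering the indices from the file names avoids hard-coding `1:10`, and building file paths directly sidesteps any ambiguity in pattern matching (for example `a1.txt` versus `a11.txt`).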

edited Apr 13 '12 at 06:17

Paul Hiemstra


answered Apr 12 '12 at 19:26

Justin


Hi Justin, thanks for your answer. Could you possibly clarify the `list.files(pattern=paste('.', x, '\\.txt', sep=''))` section? I'm unsure what the `('.', x, '\\.txt', sep='')` arguments do. – user1322296 Apr 12 '12 at 20:22
`paste` takes an arbitrary number of arguments and smooshes them together with the specified separator (`sep=''` gives no separator). e.g. `paste('foo', 'bar', sep=' ')` versus `paste('foo', 'bar', sep='!')`. Does that help? I'm just mashing the index (the number 1 through 10) onto a regular expression that matches a single character (a or b) plus number.txt (so it should find a1.txt and b1.txt if x=1). Clear as mud I'm sure! – Justin Apr 12 '12 at 20:27
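
The explanation above can be tried out directly at the console. The snippet below is a small illustration, with made-up file names, of what `paste` builds and which names the resulting regular expression matches. The anchored variant at the end is an added suggestion rather than part of the original answer.

```r
paste('foo', 'bar', sep = ' ')   # "foo bar"
paste('foo', 'bar', sep = '!')   # "foo!bar"

x <- 1
pattern <- paste('.', x, '\\.txt', sep = '')
pattern                          # prints ".1\\.txt", i.e. the regex .1\.txt

# Made-up file names, for illustration only
candidates <- c('a1.txt', 'b1.txt', 'a2.txt', 'a11.txt')

# The unanchored pattern also matches a11.txt ("1" followed by "1.txt")
loose <- grepl(pattern, candidates)
loose                            # TRUE TRUE FALSE TRUE

# Anchoring the pattern restricts it to exactly a<x>.txt or b<x>.txt
strict <- grepl(paste0('^[ab]', x, '\\.txt$'), candidates)
strict                           # TRUE TRUE FALSE FALSE
```

The unanchored pattern is fine for up to nine file pairs. Once there are ten or more, `x = 1` would also pick up `a11.txt`, so the anchored form is the safer choice.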
