
I have hundreds of text files with the following information in each file:

*****Auto-Correlation Results******
1     .09    -.19     .18     non-Significant

*****STATISTICS FOR MANN-KENDELL TEST******
S=  609
VAR(S)=      162409.70
Z=           1.51
Random : No trend at 95%

*****SENs STATISTICS ******
SEN SLOPE =  .24

I want to read all of these files, collect the Sen's slope value from each one (e.g. .24), and compile the values into a single file along with the corresponding file names. I am reading the files with this code:

require(gtools)
GG <- grep("*.txt", list.files(), value = TRUE)
GG<-mixedsort(GG)
S <- sapply(seq(GG), function(i){
    X <- readLines(GG[i])
    grep("SEN SLOPE", X, value = TRUE)
    })
spl <- unlist(strsplit(S, ".*[^(-|\\s).0-9]"))
SenStat <- as.numeric(spl[nzchar(spl)])
SenStat<-data.frame( SenStat,file = GG)
write.table(SenStat, "sen.csv",sep = ", ",row.names = FALSE)

But now I am getting this error:

Error in strsplit(S, ".*[^(-|\\s).0-9]") : non-character argument

Could anyone please help? Thanks.

Rich Scriven
Geekuna Matata
  • S could be a factor, try wrapping it like this `as.character(S)` in your `strsplit(S, ".*[^(-|\\s).0-9]")` call. Another alternative: if the Sen Slope is always on the same line, you could just read in the file and extract that line (row number) – infominer Apr 23 '14 at 03:08
  • You changed the code a lot from [the last time I answered this question](http://stackoverflow.com/questions/23038367/how-do-i-read-information-from-text-files/23038389?noredirect=1#comment35547561_23038389). I even just checked it for you. What gives? – Rich Scriven Apr 23 '14 at 03:10
  • Hi Richard, isn't it the same code? I am not able to tell the difference. I have only added Mixed sort. Could you please help? :( – Geekuna Matata Apr 23 '14 at 04:25
  • Copy and paste the code after "TO LOOP OVER MULTIPLE FILES" on the other post. `mixedsort` is not in my answer. – Rich Scriven Apr 23 '14 at 04:46
  • This line: `grep("SEN SLOPE", X, value = TRUE)` does nothing. Its value is not assigned and will be garbage collected. – IRTFM Apr 23 '14 at 06:33
  • @BondedDust, that line is inside the call to `sapply` – Rich Scriven Apr 23 '14 at 09:34
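
The error message suggests that `S` is no longer a character vector by the time it reaches `strsplit`. One possible cause, not confirmed anywhere in this thread, is that `sapply` returns a list when at least one file has no `SEN SLOPE` line, because `grep(..., value = TRUE)` then yields `character(0)` for that file and the results cannot be simplified to a vector. A minimal sketch of that scenario follows; `good` and `bad` are made-up stand-ins for the output of `readLines`.

```r
# Hypothetical file contents standing in for readLines() output
good <- c("*****SENs STATISTICS ******", "SEN SLOPE =  .24")
bad  <- c("*****SENs STATISTICS ******")   # no "SEN SLOPE" line

files <- list(good, bad)

# Results of unequal length cannot be simplified, so sapply returns a list
S <- sapply(files, function(x) grep("SEN SLOPE", x, value = TRUE))
is_list <- is.list(S)

# strsplit() only accepts character vectors, so a list triggers the error
err <- tryCatch(strsplit(S, "="), error = function(e) conditionMessage(e))

# One way to guard against it: return NA for files without a match
S_safe <- vapply(files, function(x) {
  hit <- grep("SEN SLOPE", x, value = TRUE)
  if (length(hit) == 0) NA_character_ else hit[1]
}, character(1))
```

If this is what is happening, checking `is.list(S)` or `lengths(S)` on the real data should show which files are missing the line.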

1 Answer


Isn't this a duplicate of this bounty question? Anyway, reposting my answer here.

Step 1: Save the complete file names in a single variable:

fileNames <- dir(dataDir,full.names=TRUE)
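
Here `dataDir` is assumed to be the path of the folder holding the result files. If that folder also contains other kinds of files, `dir` (an alias for `list.files`) can filter names through its `pattern` argument. Note that `pattern` is a regular expression, not a shell glob, so `"\\.txt$"` is used rather than `"*.txt"`. A small sketch, with a temporary folder as a hypothetical `dataDir`:

```r
# Hypothetical data folder created only for this demonstration
dataDir <- file.path(tempdir(), "sen_demo_listing")
dir.create(dataDir, showWarnings = FALSE)
file.create(file.path(dataDir, c("site1.txt", "site2.txt", "notes.csv")))

# Keep only files whose names end in ".txt"; pattern is a regular expression
fileNames <- dir(dataDir, pattern = "\\.txt$", full.names = TRUE)
basename(fileNames)
```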

Step 2: Let's read and process one of the files to make sure it gives the correct result:

# Extract the value after "=" on the "SEN SLOPE" line of the first file
data.frame(
  file = basename(fileNames[1]),
  SEN.SLOPE = as.numeric(tail(
    strsplit(grep('SEN SLOPE', readLines(fileNames[1]), value = TRUE), "=")[[1]], 1))
)

Step 3: Do this for all the file names:

# Build a one-row data frame per file, then stack them row-wise
do.call(
  rbind,
  lapply(fileNames,
         function(fileName) data.frame(
           file = basename(fileName),
           SEN.SLOPE = as.numeric(tail(
             strsplit(grep('SEN SLOPE',
                           readLines(fileName), value = TRUE), "=")[[1]], 1))
         )
  )
)
Hope this helps!!

Community
Shambho