
I have been struggling to write a simple tryCatch() in my code. I asked this question last week (how to write tryCatch() function when extracting data from multiple files?) but could not get it working. I am working with thousands of files (with the .MOD extension) and want to extract specific information from each of them. This information will be collected into one Excel sheet, with each row representing one .MOD file. The following is an example of one .MOD file.

> AR.MOD <- read.table("Sample1ar.MOD", header = FALSE, fill = TRUE)
> AR.MOD
    V1 V2        V3       V4   V5    V6   V7    V8
1 Case  1 23-3-2013 14:47:40                      
2  Run NA                                         
3    R  1    767,96  1647,72 1,78 18,88 0,66 37,33

I have managed to successfully extract information from each .MOD file using the following code:

> AR.MOD.files <- list.files(pattern = "AR.MOD|ar.MOD")
> for (fileName in AR.MOD.files) {
+     AR.MOD <- read.table(fileName, header = FALSE, fill = TRUE)
+     AR.MOD.subset1 <- AR.MOD[c(1), 3:4]
+     names(AR.MOD.subset1) <- c("Col1", "Col2")
+     AR.MOD.subset2 <- AR.MOD[c(3), 3:8]
+     names(AR.MOD.subset2) <- c("Col3", "Col4", "Col5", "Col6", "Col7", "Col8")
+     AR.MOD.final <- merge(AR.MOD.subset1, AR.MOD.subset2)
+     ID <- basename(fileName)
+     AR.MOD.final <- merge (ID, AR.MOD.final)
+     colnames(AR.MOD.final)[colnames(AR.MOD.final)=="x"] <- "ID"
+     if(match(fileName,AR.MOD.files)==1){
+         output.AR.MOD <- AR.MOD.final
+     }else{
+         output.AR.MOD <- rbind(output.AR.MOD,AR.MOD.final)}
+ }
> print(output.AR.MOD)
               ID       Col1     Col2   Col3    Col4 Col5  Col6 Col7  Col8
1   File1AR.MOD  7-12-2010 14:48:51 574,75 1028,04 2,69 11,68 0,62 37,33
2   File2AR.MOD 22-11-2011 11:43:02 536,15 1033,37 2,54 30,04 0,66 40,33
3   File3AR.MOD  8-11-2011 11:48:20 695,46 1616,14 1,20 35,34 0,65 58,00
4   File4AR.MOD 30-11-2010 12:27:08 825,39 1862,94 1,11  8,43 0,68 54,00
5   File5AR.MOD  25-1-2011 11:33:07 582,52 1205,84 2,03  7,32 0,67 44,00

However, about 10-20 files (out of the tens of thousands) do not contain the information in the required format (for example, column V8 is missing). This raises the following error and stops the script.

Error in `[.data.frame`(AR.MOD, c(3), 3:8) : undefined columns selected 
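
For reference, the error can be reproduced without any files. This is only a minimal sketch on a made-up 7-column data frame (`bad` is a hypothetical stand-in for a malformed .MOD file). It also shows that tryCatch() returns the value of its error handler, which is what makes a per-file fallback possible.

```r
# Toy data frame with only 7 columns (V1..V7), standing in for a malformed file
bad <- as.data.frame(matrix("x", nrow = 3, ncol = 7), stringsAsFactors = FALSE)

# Selecting columns 3:8 fails because column 8 does not exist.
# tryCatch() returns the handler's value instead of stopping the script.
msg <- tryCatch(
    {
        bad[3, 3:8]
        "no error"
    },
    error = function(e) conditionMessage(e)
)

print(msg)  # the text of the caught error
```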

I therefore tried wrapping the loop body in tryCatch() as follows, so that the script could continue to run, but I could not get it to work. Can anyone help me correct the following code so that, for files that raise an error, the affected cells contain "Error"?

AR.MOD.files <- list.files(pattern = "AR.MOD|ar.MOD")
for (fileName in AR.MOD.files) {
    tryCatch(
        expr = {
            AR.MOD <- read.table(fileName, header = FALSE, fill = TRUE)
            AR.MOD.subset1 <- AR.MOD[c(1), 3:4]
            names(AR.MOD.subset1) <- c("Col1", "Col2")
            AR.MOD.subset2 <- AR.MOD[c(3), 3:8]
            names(AR.MOD.subset2) <- c("Col3", "Col4", "Col5", "Col6", "Col7", "Col8")
            AR.MOD.final <- merge(AR.MOD.subset1, AR.MOD.subset2)
            ID <- basename(fileName)
            AR.MOD.final <- merge (ID, AR.MOD.final)
            colnames(AR.MOD.final)[colnames(AR.MOD.final)=="x"] <- "ID"
            if(match(fileName,AR.MOD.files)==1){
                output.AR.MOD <- AR.MOD.final
            }else{
                output.AR.MOD <- rbind(output.AR.MOD,AR.MOD.final)}
        },
        error = function(e){
            message("Error")
            print(e)
        }
    )
}
print(output.AR.MOD)

The following is what I want when a file raises an error:

  > print(output.AR.MOD)
                   ID       Col1     Col2   Col3    Col4 Col5  Col6 Col7  Col8
    1   File1AR.MOD  7-12-2010 14:48:51 574,75 1028,04 2,69 11,68 0,62 37,33
    2   File2AR.MOD 22-11-2011 11:43:02 536,15 1033,37 2,54 30,04 0,66 40,33
    3   File3AR.MOD  8-11-2011 11:48:20 Error  Error   Error Error Error Error
    4   File4AR.MOD 30-11-2010 12:27:08 825,39 1862,94 1,11  8,43 0,68 54,00
    5   File5AR.MOD  25-1-2011 11:33:07 582,52 1205,84 2,03  7,32 0,67 44,00
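
One way to get that shape (date and time kept, only Col3 to Col8 replaced) is to put the tryCatch() around just the selection that can fail. The following is a rough sketch, not a tested solution for the real files: `extract_row`, `good` and `bad` are hypothetical names, and the toy data frames are typed in by hand rather than read from .MOD files.

```r
# Hypothetical helper: builds one output row from an already-read data frame.
# Col3..Col8 fall back to "Error" if row 3, columns 3:8 cannot be selected.
extract_row <- function(df, id) {
    first <- df[1, 3:4]
    second <- tryCatch(
        df[3, 3:8],
        error = function(e) {
            data.frame(matrix("Error", nrow = 1, ncol = 6), stringsAsFactors = FALSE)
        }
    )
    row <- data.frame(ID = id, first, second,
                      row.names = NULL, stringsAsFactors = FALSE)
    names(row) <- c("ID", paste0("Col", 1:8))
    row
}

# Toy input mimicking the sample file shown in the question
good <- data.frame(
    V1 = c("Case", "Run", "R"), V2 = c(1, NA, 1),
    V3 = c("23-3-2013", "", "767,96"), V4 = c("14:47:40", "", "1647,72"),
    V5 = c("", "", "1,78"), V6 = c("", "", "18,88"),
    V7 = c("", "", "0,66"), V8 = c("", "", "37,33"),
    stringsAsFactors = FALSE
)
bad <- good[, 1:7]  # V8 missing, as in the failing files

out <- rbind(extract_row(good, "good.MOD"), extract_row(bad, "bad.MOD"))
print(out)
```

Note that in this sketch all six cells become "Error" even when only V8 is missing, which matches the table above.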
Letin
  • Something like `output.AR.MOD <- lapply(AR.MOD.files, function(fileName) tryCatch({...; }, error = function(e) {as.data.frame(c(list(ID = basename(fileName)),as.list(setNames(rep(NA, 8), paste0("Col", 1:8)))))}))` – Roland Sep 10 '19 at 10:39
  • Then you can use `do.call(rbind, output.AR.MOD)`. – Roland Sep 10 '19 at 10:40
  • Thanks @Roland, it worked for me. I could not retrieve the ID names in case of error because they are NAs too, but it is fine for now, as long as it does the job I needed. – Letin Sep 11 '19 at 12:03
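
For readers who want to try the approach from the comments, here is a minimal sketch of it, under several assumptions. `read_one` and `na_row` are hypothetical helper names, the two demo files are written to a temporary directory with made-up contents, and the `list.files()` pattern with `ignore.case = TRUE` is slightly broader than the one in the question. Because the handler is defined inside the anonymous function, it can still see `fileName`, so the ID is kept for the failing file.

```r
# Hypothetical per-file reader; any error propagates to the caller
read_one <- function(fileName) {
    AR.MOD <- read.table(fileName, header = FALSE, fill = TRUE,
                         stringsAsFactors = FALSE)
    row <- data.frame(ID = basename(fileName),
                      AR.MOD[1, 3:4], AR.MOD[3, 3:8],
                      row.names = NULL, stringsAsFactors = FALSE)
    names(row) <- c("ID", paste0("Col", 1:8))
    row
}

# Placeholder row for a failing file: ID kept, all other columns NA
na_row <- function(fileName) {
    as.data.frame(
        c(list(ID = basename(fileName)),
          as.list(setNames(rep(NA_character_, 8), paste0("Col", 1:8)))),
        stringsAsFactors = FALSE
    )
}

# Two made-up demo files: the second one lacks the eighth field
demo_dir <- tempfile("mod")
dir.create(demo_dir)
writeLines(c("Case 1 23-3-2013 14:47:40", "Run",
             "R 1 767,96 1647,72 1,78 18,88 0,66 37,33"),
           file.path(demo_dir, "File1AR.MOD"))
writeLines(c("Case 1 8-11-2011 11:48:20", "Run",
             "R 1 695,46 1616,14 1,20 35,34 0,65"),
           file.path(demo_dir, "File2AR.MOD"))

files <- list.files(demo_dir, pattern = "AR\\.MOD$",
                    ignore.case = TRUE, full.names = TRUE)

# One row per file; a failing file yields the NA placeholder row instead
rows <- lapply(files, function(fileName) {
    tryCatch(read_one(fileName), error = function(e) na_row(fileName))
})
output.AR.MOD <- do.call(rbind, rows)
print(output.AR.MOD)
```

If the literal text "Error" is preferred over NA, something like `output.AR.MOD[is.na(output.AR.MOD)] <- "Error"` after binding should work here, since every column in this sketch is character.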
