6

I have many unattended batch jobs in R running on a server and I have to analyse job failures after they have run.

I am trying to catch errors to log them and recover from the error gracefully but I am not able to get a stack trace (traceback) to log the code file name and line number of the R command that caused the error. A (stupid) reproducible example:

f <- function() {
  1 + variable.not.found    # stupid error
}

tryCatch( f(), error=function(e) {
  # Here I would log the error message and stack trace (traceback)
  print(e)            # error message is no problem
  traceback()         # stack trace does NOT work
  # Here I would handle the error and recover...
})

Running the code above produces this output:

simpleError in f(): object 'variable.not.found' not found

No traceback available

The traceback is not available and the reason is documented in the R help (?traceback):

Errors which are caught via try or tryCatch do not generate a traceback, so what is printed is the call sequence for the last uncaught error, and not necessarily for the last error.

In other words: Catching an error with tryCatch does kill the stack trace!

How can I

  • handle errors and
  • log the stack trace (traceback) for further examination
  • [optionally] without using undocumented or hidden R internal functions that are not guaranteed to work in the future?

THX a lot!

R Yoda
  • 8,358
  • 2
  • 50
  • 87

2 Answers2

8

Sorry for the long answer but I wanted to summarize all knowledge and references in one answer!

Main issues to be solved

  1. tryCatch "unrolls" the call stack to the tryCatch call so that traceback and sys.calls do no longer contain the full stack trace to identify the source code line that causes an error or warning.

  2. tryCatch aborts the execution if you catch a warning by passing a handler function for the warning condition. If you just want to log a warning you cannot continue the execution as normal.

  3. dump.frames writes the evaluation environments (frames) of the stack trace to allow post-mortem debugging (= examining the variable values visible within each function call) but dump.frames "forgets" to save the workspace too if you set the parameter to.file = TRUE. Therefore important objects may be missing.

  4. Find a simple logging framework since R does not support decent logging out of the box

  5. Enrich the stack trace with the source code lines.

Solution concept

  1. Use withCallingHandlers instead of tryCatch to get a full stack trace pointing to the source code line that throwed an error or warning.

  2. Catch warnings only within withCallingHandlers (not in tryCatch) since it just calls the handler functions but does not change the program flow.

  3. Surround withCallingHandlers with tryCatch to catch and handle errors as wanted.

  4. Use dump.frames with the parameter to.file = FALSE to write the dump into global variable named last.dump and save it into a file together with the global environment by calling save.image.

  5. Use a logging framework, e. g. the package futile.logger.

  6. R does track source code references when you set options(keep.source = TRUE). You can add this option to your .Rprofile file or use a startup R script that sets this option and source your actual R script then.

  7. To enrich the stack trace with the tracked source code lines you can use the undocumented (but widely used) function limitedLabels.

  8. To filter out R internal function calls from stack trace you can remove all calls that have no source code line reference.

Implementation

Code template

Instead of using tryCatch you should use this code snippet:

library(futile.logger)

tryCatch(
  withCallingHandlers(<expression>,
    error = function(e) {
      call.stack <- sys.calls() # is like a traceback within "withCallingHandlers"
      dump.frames()
      save.image(file = "last.dump.rda")
      flog.error(paste(e$message, limitedLabels(call.stack), sep = "\n"))
    }
    warning = <similar to error above>
  }
  error = <catch errors and recover as you would do it normally>
  # warning = <...> # never do this here since it stops the normal execution like an error!
  finally = <your clean-up code goes here>
}

Reusable implementation via a package (tryCatchLog)

I have implemented a simple package with all the concepts mentioned above. It provides a function tryCatchLog using the futile.logger package.

Usage:

library(tryCatchLog)  # or source("R/tryCatchLog.R")

tryCatchLog(<expression>,
            error = function(e) {
              <your error handler>
            })

You can find the free source code at github:

https://github.com/aryoda/tryCatchLog

You could also source the tryCatchLog function instead of using a full blown package.

Example (demo)

See the demo file that provides a lot of comments to explain how it works.

References

Other tryCatch replacements

Other helpful links

http://adv-r.had.co.nz/Exceptions-Debugging.html

A Warning About warning() - avoid R's warning feature

In R, why does withCallingHandlers still stops execution?

How to continue function when error is thrown in withCallingHandlers in R

Can you make R print more detailed error messages?

How can I access the name of the function generating an error or warning?

How do I save warnings and errors as output from a function?

options(error=dump.frames) vs. options(error=utils::recover)

General suggestions for debugging in R

Suppress warnings using tryCatch in R

R Logging display name of the script

Background information about the "srcrefs" attribute (Duncan Murdoch)

get stack trace on tryCatch'ed error in R

R Yoda
  • 8,358
  • 2
  • 50
  • 87
  • 3
    I really wonder why some seemingly simple things are so complicated in R. Usually at the time when an exception handler is called, the call stack should be intact still; only when recovering from the error, some magic is necessary. – U. Windl Aug 04 '17 at 07:54
2

The traceback function can be used to print/save the current stack trace, but you have to specify an integer argument, which is the number of stack frames to omit from the top (can be 0). This can be done inside a tryCatch block or anywhere else. Say this is the content of file t.r:

f <- function() {
    x <- 1
    g()
}
g <- function() {
    traceback(0)
}

When you source this file into R and run f, you get the stack trace:

3: traceback(0) at t.r#7
2: g() at t.r#3
1: f()

which has file name and line number information for each entry. You will get several stack frames originating from the implementation of tryCatch and you can't skip them by specifying a non-zero argument to traceback, yet indeed this will break in case the implementation of tryCatch changes.

The file name and line number information (source references) will only be available for code that has been parsed to keep source references (by default the source'd code, but not packages). The stack trace will always have call expressions.

The stack trace is printed by traceback (no need to call print on it).

For logging general errors, it is sometimes useful to use options(error=), one then does not need to modify the code that causes the errors.

Tomas Kalibera
  • 1,061
  • 9
  • 13
  • Thx for the detailled answer. My problem is still that the `traceback` is empty within a `tryCatch` handler and I want to log/print the traceback only in case of an error. If you modify your example by adding a line at the end of the f() function that causes an error (e. g. `stop("my error")`) and call f() via `tryCatch(f(), error=function(e) print(traceback()))` you still get nothing but a **"No traceback available"**. I am looking for a workaround for this. – R Yoda Oct 12 '16 at 14:09
  • Interesting behaviour, I think we have found a "dirty hack" in R's `traceback` or `tryCatch` implementation: While `traceback()` delivers "No traceback available" a `traceback(0)` delivers a full stack trace by omitting zero elements from the empty stack :-) – R Yoda Oct 12 '16 at 14:23
  • Still: `traceback(0)` delivers in the context of `tryCatch` the wrong stack trace (not the call hierarchy of error location but the call hierarchy of `sys.calls()` which does **not** point to the error line in the source file. Try it: `f <- function() { g() stop("error") } g <- function() { print("g") } tryCatch(f(), error=function(e) traceback(0))` – R Yoda Oct 12 '16 at 14:25
  • This is documented behavior, "traceback()" gives call stack of last uncaught error, "traceback(number)" gives current stack. Yes, you can get the same by sys.calls(). – Tomas Kalibera Oct 12 '16 at 14:46
  • Full ACK, this strange semantics is documented: "The default display is of the stack of the last uncaught error ... If x is numeric, then the current stack is printed". Thanks for your feed back! I am playing around with `withCallingHandlers` currently which is promissing... – R Yoda Oct 12 '16 at 15:00