
Is there a good way to warn just once in R?

What I currently do is the usual

a_reason_to_warn_has_occurred <- FALSE

lapply(data, function(data) {
       result <- do_something(data)
       if (warning_reason)
           a_reason_to_warn_has_occurred <<- TRUE  # <<- so the outer flag is updated
       result
})

if (a_reason_to_warn_has_occurred)
    warning("This was bad.")

Is there a way to do this with less clutter/boiler-plate code?

I'd really love something like

lapply(data, function(data) {
       result <- do_something(data)
       warn_once_if(warning_reason, "This was bad.")
       result
})

but I'm not sure whether it is even possible to implement this in R.

Eike P.
  • What I do is to save all warnings and errors and then process afterwards. This helps in finding where the warning/error was, and also it will continue even if an error occurs. See http://stackoverflow.com/q/4948361/210673 – Aaron left Stack Overflow Aug 08 '14 at 15:14
  • Nitpicking: the convention of using dots between words in identifiers is terrible, since it conflicts with S3. Consider using underscores instead, which is becoming more and more common in R code bases. – Konrad Rudolph Aug 08 '14 at 15:24
  • 1
    @KonradRudolph just fixed the naming convention to please you. :) – Eike P. Aug 14 '14 at 15:32
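
One way to read the "save all warnings and process afterwards" idea from the first comment is a wrapper built on withCallingHandlers(). The following is only a minimal sketch under that assumption, not the code from the linked question; the name warn_unique is made up for illustration. It muffles warnings while the expression runs and re-issues each distinct message once at the end.

```r
# Hypothetical helper (not from the thread): evaluates `expr`, muffles the
# warnings raised meanwhile, and re-issues each distinct message once.
warn_unique <- function(expr) {
  seen <- character()
  result <- withCallingHandlers(
    expr,
    warning = function(w) {
      # Remember the message, then stop it from being reported right away
      seen <<- union(seen, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  # Report every distinct message exactly once, after the work is done
  for (msg in seen) warning(msg, call. = FALSE)
  result
}

# Two elements trigger the warning, but it is reported only once
squares <- warn_unique(
  sapply(c(1, -2, -3), function(x) {
    if (x < 0) warning("This was bad.")
    x^2
  })
)
```

Note that this deduplicates by message text and delays the warnings until the wrapped expression has finished, which may or may not be what you want.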

2 Answers

General remarks

I think your solution is fine, and I would probably use that in production code. Nevertheless, if you are interested in another, cooler but possibly more fragile way of doing this, read on.

A solution using non-standard evaluation

It is certainly possible to create a function that takes an expression, evaluates it, and takes care of warning only once for each reason. You could use it like this:

warn_once(
  lapply(data, function(data) {
     result <- doSomething(data)
     warn_if_first(reason = "bad data argument", message = "This was bad.")
     result
  })
)

It is also possible to do it in the form you suggested, but it is tricky to set the scope in which you want only one warning. E.g. look at these two examples. The first one is your original code.

lapply(data, function(data) {
       result <- doSomething(data)
       warn_if_first(warningReason, "This was bad.")
       result
})

This is easy. You want one warning per the outer lapply block. But if you have the following one:

lapply(data, function(data) {
       result <- doSomething(data)
       sapply(result, function(x) {
           warn_if_first(warningReason, "This was bad.")
       })
       result
})

then (at least with the straightforward implementation of warn_if_first) you will get one warning per sapply call, and there is no easy way to tell warn_if_first if you want one warning per lapply call.

So I suggest the form above, that explicitly specifies the environment in which you will get a single warning.

Implementation

warn_once <- function(..., asis = FALSE) {
  .warnings_seen <- character()
  if (asis) {
    exprs <- list(...)
  } else {
    exprs <- c(as.list(match.call(expand.dots = FALSE)$...))
  }
  sapply(exprs, eval, envir = parent.frame())
}

warn_if_first <- function(reason, ...) {
  ## Look for .warnings_seen
  for (i in sys.nframe():1) {  # parent.frame() requires n >= 1
    warn_env <- parent.frame(i)
    found_it <- exists(".warnings_seen", warn_env)
    if (found_it) { break }
  }
  if (!found_it) { stop("'warn_if_first' not inside 'warn_once'") }

  ## Warn if first, and mark the reason
  .warnings_seen <- get(".warnings_seen", warn_env)
  if (! reason %in% .warnings_seen) {
    warning(...)
    .warnings_seen <- c(.warnings_seen, reason)
    assign(".warnings_seen", .warnings_seen, warn_env)
  }
}

Example usage

Let's try it!

warn_once({
  for (i in 1:10) { warn_if_first("foo", "oh, no! foo!") }
  for (i in 1:10) { warn_if_first("bar", "oh, no! bar!") }
  sapply(1:10, function(x) {
    warn_if_first("foo", "oh, no! foo again! (not really)")
    warn_if_first("foobar", "foobar, too!")
  })
  "DONE!"
})

Which outputs

[1] "DONE!"
Warning messages:
1: In warn_if_first("foo", "oh, no! foo!") : oh, no! foo!
2: In warn_if_first("bar", "oh, no! bar!") : oh, no! bar!
3: In warn_if_first("foobar", "foobar, too!") : foobar, too!

and this seems about right. A glitch is that the warning is reported as coming from warn_if_first, and not from its calling environment, as it should be, but I have no idea how to fix this. warning also uses non-standard evaluation, so it is not as simple as just doing eval(warning(...), envir = parent.frame()). You can supply call. = FALSE to warning() or to warn_if_first(), and then you will get

[1] "DONE!"
Warning messages:
1: oh, no! foo! 
2: oh, no! bar! 
3: foobar, too! 

which is probably better.

Caution

While I don't see any obvious problems with this implementation, I cannot guarantee that it does not break in some special circumstances. It is very easy to make mistakes with non-standard evaluation. Some base R functions and some popular packages like magrittr use non-standard evaluation too, so you have to be doubly cautious, because they might interact with this code.

The variable name I used for the book-keeping, .warnings_seen, is unusual enough that it will not interfere with other code most of the time. If you want to be (almost) completely sure, generate a long random string and use that as the variable name instead.
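
Generating such a name could look like the sketch below. The helper random_name and the variable seen_name are hypothetical names used only for illustration; the name would have to be generated once and shared by both functions.

```r
# Hypothetical helper: builds a hidden variable name from random letters
random_name <- function(n = 20) {
  chars <- sample(c(letters, LETTERS), n, replace = TRUE)
  paste0(".", paste(chars, collapse = ""))
}

# Generate the name once; both functions would then refer to `seen_name`
seen_name <- random_name()

# Store and look up the book-keeping vector under that name
env <- new.env()
assign(seen_name, character(), envir = env)
exists(seen_name, envir = env, inherits = FALSE)
```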


Gabor Csardi
  • Nice answer! I was also already thinking about that scoping problem. There is a reason you do try{...} catch after all... I'm really looking forward to your implementation as I'm not so much into parsing functionality. I guess you just explicitly add the boiler-plate code at the beginning and the end of warn_once's argument and then replace the call to warning_once sensibly? – Eike P. Aug 08 '14 at 13:48
  • Exactly. It is really not hard to implement it. Unless I am overlooking something, which is certainly possible. – Gabor Csardi Aug 08 '14 at 13:54
  • By the way, what functionality would you use for implementing the version without scoping? I really can't imagine an R way of doing this. Things seem to get really complicated once you have two such statements in a piece of code, i.e. lapply(..., ... warn.once); lapply(..., ... warn.once). How would you distinguish the two calls to warn.once, regardless of the scoping problem that you mentioned? – Eike P. Aug 08 '14 at 14:01
  • You create a temporary variable in the frame `warn_if_once` was called from, a `character` that holds all warning types that were given already. So this would always restrict the warnings to the _caller_ of `warn_if_once`. You could theoretically have an argument that tells how many levels to go up from `warn_if_once`, but I think this is messy, and the other solution is better. This temporary environment is destroyed when the caller of `warn_if_once` quits. If you go up to .GlobalEnv, then you do not get any warnings ever again. (Well, until you explicitly remove the temporary variable.) – Gabor Csardi Aug 08 '14 at 14:09
  • Concerning scoping, I would inject the definition of `warn_if_first` into the scope of the expression to be executed inside `warn_once`. That way, it cannot be called from outside, and the sanity check in it is unnecessary. – Konrad Rudolph Aug 08 '14 at 15:26
  • I think you could do it more simply with a closure. `warn_once <- function(msg) {seen <- FALSE; function(condition) if (!seen && condition) {seen <<- TRUE; warning(msg)}}` – hadley Aug 08 '14 at 18:31
  • Or maybe just `warn_once <- function() {seen <- FALSE; function(...) if (!seen) {seen <<- TRUE; warning(...)}}` – hadley Aug 08 '14 at 18:49
  • @hadley: how would you do that exactly? How do you use `warn_once`? I think if you want to catch downstream warnings, then you cannot use lexical scoping, since the functions you call might not be defined within `warn_once`. That's why you need the dynamic scoping. But every time I think about it, I have to convince myself about it. :) – Gabor Csardi Aug 08 '14 at 18:55
  • @hadley I guess you could define everything using lexical scoping, then it is indeed much simpler. My idea was to define something that is similar to how `suppressWarnings` & co. work. – Gabor Csardi Aug 08 '14 at 21:11
  • Could someone elaborate on what hadley means? Too concise for me... how do I use that function, as Gabor already asked? Having said that: I just added an additional answer with an attempt for a non-scoping solution. As this is all new to me, comments are most welcome. – Eike P. Aug 14 '14 at 15:11
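
Regarding the question of how the closure from the comments would be used: as far as I can tell, the idea is to call the outer function once, before the loop, and to call the function it returns inside the loop. The sketch below follows the second variant from the comments; the factory is renamed make_warn_once here (an assumed name) to avoid a clash with the warn_once defined in the answer, and call. = FALSE is added as an assumption so that the message is not attributed to the helper.

```r
# Factory: each call returns a fresh function with its own `seen` flag
make_warn_once <- function() {
  seen <- FALSE
  function(...) {
    if (!seen) {
      seen <<- TRUE             # flip the flag in the enclosing environment
      warning(..., call. = FALSE)
    }
    invisible(NULL)
  }
}

data <- list(1, -2, -3)

# Create the warner once, outside the loop ...
warn_once <- make_warn_once()

# ... and call it as often as needed; only the first call warns
results <- lapply(data, function(x) {
  if (x < 0) warn_once("This was bad.")
  abs(x)
})
```

The scope of "once" is then simply the lifetime of the function returned by make_warn_once(); creating a second one gives an independent flag. This relies on lexical scoping, so code that cannot see the returned function cannot use it.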

Based on the comments and Gabor's answer, here is the result of me trying to implement a non-scoping solution. It is based on comparing the tracebacks of the calls to warn_once. Take care as this is just a quick draft and definitely not perfect. For more information, see below.

warn_once <- function(mesg) {
    trace <- traceback(0)
    if (exists(".warnings_shown", sys.frame(1))) {
        warn_list <- get(".warnings_shown", sys.frame(1))
        found_match <- FALSE
        for (warn in warn_list)
            if (all(unlist(Map(`==`, warn, trace))))
                return()
        warn_list[[length(warn_list)+1]] <- trace
        assign(".warnings_shown", warn_list, envir=sys.frame(1))
        warning(mesg)
    } else {
        assign(".warnings_shown", list(trace), envir=sys.frame(1))
        warning(mesg)
    }
}

As a test case, I used ...

func <- function(x) {
  func2(x)
  func2(not(x))
  func2(x)
  func2(not(x))
}

func2 <- function(x) {
  if(x) for(i in 1:3) warn_once("yeah")
  if(not(x)) warn_once("nope")
  warn_once("yeah")
}

func(T)

... which resulted in ...

Warning in warn_once("yeah") : yeah
Warning in warn_once("yeah") : yeah
Warning in warn_once("nope") : nope
Warning in warn_once("yeah") : yeah
Warning in warn_once("yeah") : yeah
Warning in warn_once("yeah") : yeah
Warning in warn_once("nope") : nope
Warning in warn_once("yeah") : yeah

... and a lot of clutter output from the call to traceback.

Notes:

  • I guess it is somehow possible to suppress the output of the calls to traceback(), but I wasn't able to do so.
  • This identifies warnings based on their position in the frame stack, as opposed to identifying them by their warning message, as in Gabor's answer. This can be but isn't necessarily desired behavior.
  • From the traceback, one could probably infer the name of the calling function and add it to the warning message, which might be useful.
  • Obviously, an optional parameter could be introduced for specifying the number of levels to go up in the frame stack in searching for ".warnings_shown".
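
On the first note: one possible way to get at the call stack without any printed output is sys.calls(), which returns the active calls instead of printing them. This is only a sketch of that idea and not a drop-in fix for the draft above; the helper name current_stack is made up.

```r
# Hypothetical helper: returns the current call stack as a character vector,
# one deparsed call per element, without printing anything
current_stack <- function() {
  calls <- as.list(sys.calls())
  calls <- head(calls, -1)  # drop the call to current_stack() itself
  vapply(calls, function(cl) paste(deparse(cl), collapse = " "), character(1))
}

outer_fn <- function() inner_fn()
inner_fn <- function() current_stack()

# The last two entries are the calls to outer_fn() and inner_fn()
stack <- outer_fn()
```

Two such vectors can then be compared with identical() to decide whether a warning has already been issued from the same position in the stack.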

Comments and improvements (just edit!) most welcome.

Eike P.