21

Preamble:

R's trace() is a powerful debugging tool, allowing users to "insert debugging code at chosen places in any function". Unfortunately, using it from the command-line can be fairly laborious.

As an artificial example, let's say I want to insert debugging code that will report the between-tick interval calculated by pretty.default(). I'd like to insert the code immediately after the value of delta is calculated, about four lines up from the bottom of the function definition. (Type pretty.default to see where I mean.) To indicate that line, I need to find which step in the code it corresponds to. The answer turns out to be step list(c(12, 3, 3)), which I zero in on by running through the following steps:

as.list(body(pretty.default))
as.list(as.list(body(pretty.default))[[12]])
as.list(as.list(as.list(body(pretty.default))[[12]])[[3]])
as.list(as.list(as.list(body(pretty.default))[[12]])[[3]])[[3]]

I can then insert debugging code like this:

trace(what = 'pretty.default',
      tracer = quote(cat("\nThe value of delta is: ", delta, "\n\n")), 
      at = list(c(12,3,3)))
## Try it
a <- pretty(c(1, 7843))
b <- pretty(c(2, 23))
## Clean up
untrace('pretty.default')

Questions:

So here are my questions: Is there a way to print out a function (or a parsed version of it) with the lines nicely labeled by the steps to which they belong? (According to Venables and Ripley, S-plus has a function tprint() that "produces a numbered listing of the body of a function for use with the at argument of trace", but R seems to have no equivalent.) Alternatively, is there another easier way, from the command line, to quickly set debugging code for a specific line within a function?

Addendum:

I used the pretty.default() example because it is reasonably tame, but with real/interesting functions, repeatedly using as.list() quickly gets tiresome and distracting. Here's an example:

as.list(as.list(as.list(as.list(as.list(as.list(as.list(as.list(as.list(body(#
model.frame.default))[[26]])[[3]])[[2]])[[4]])[[3]])[[4]])[[4]])[[4]])[[3]]
Josh O'Brien
  • 159,210
  • 26
  • 366
  • 455

3 Answers3

8

Here's something that works pretty well for pretty.default and model.frame.default.

print.func <- function(func, ...) {
  str(as.list.func(func, ...), comp.str="")
}

as.list.func <- function(func, recurse.keywords = c("{", "if", "repeat", "while", "for", "switch")) {
  as.list.func.recurse(body(func), recurse.keywords)
}

as.list.func.recurse <- function(x, recurse.keywords) {
  x.list <- as.list(x)
  top <- deparse(x.list[[1]])
  if (length(x.list) > 1 && top %in% recurse.keywords) {
    res <- lapply(x.list, as.list.func.recurse, recurse.keywords)
    setNames(res, seq_along(res))
  } else {
    x
  }
}

Results for pretty.default:

> print.func(pretty.default)
List of 13
 1 : symbol {
 2 : language x <- x[is.finite(x <- as.numeric(x))]
 3 :List of 3
  ..$ 1: symbol if
  ..$ 2: language length(x) == 0L
  ..$ 3: language return(x)
 4 :List of 3
  ..$ 1: symbol if
  ..$ 2: language is.na(n <- as.integer(n[1L])) || n < 0L
  ..$ 3: language stop("invalid 'n' value")
 5 :List of 3
  ..$ 1: symbol if
  ..$ 2: language !is.numeric(shrink.sml) || shrink.sml <= 0
  ..$ 3: language stop("'shrink.sml' must be numeric > 0")
 6 :List of 3
  ..$ 1: symbol if
  ..$ 2: language (min.n <- as.integer(min.n)) < 0 || min.n > n
  ..$ 3: language stop("'min.n' must be non-negative integer <= n")
 7 :List of 3
  ..$ 1: symbol if
  ..$ 2: language !is.numeric(high.u.bias) || high.u.bias < 0
  ..$ 3: language stop("'high.u.bias' must be non-negative numeric")
 8 :List of 3
  ..$ 1: symbol if
  ..$ 2: language !is.numeric(u5.bias) || u5.bias < 0
  ..$ 3: language stop("'u5.bias' must be non-negative numeric")
 9 :List of 3
  ..$ 1: symbol if
  ..$ 2: language (eps.correct <- as.integer(eps.correct)) < 0L || eps.correct > 2L
  ..$ 3: language stop("'eps.correct' must be 0, 1, or 2")
 10: language z <- .C("R_pretty", l = as.double(min(x)), u = as.double(max(x)), n = n,      min.n, shrink = as.double(shrink.sml), high.u.fact = as.double(c(high.u.bias,  ...
 11: language s <- seq.int(z$l, z$u, length.out = z$n + 1)
 12:List of 3
  ..$ 1: symbol if
  ..$ 2: language !eps.correct && z$n
  ..$ 3:List of 3
  .. ..$ 1: symbol {
  .. ..$ 2: language delta <- diff(range(z$l, z$u))/z$n
  .. ..$ 3:List of 3
  .. .. ..$ 1: symbol if
  .. .. ..$ 2: language any(small <- abs(s) < 1e-14 * delta)
  .. .. ..$ 3: language s[small] <- 0
 13: symbol s

Results for model.frame.default:

> print.func(model.frame.default)
List of 29
 1 : symbol {
 2 : language possible_newdata <- !missing(data) && is.data.frame(data) && identical(deparse(substitute(data)),      "newdata") && (nr <- nrow(data)) > 0
 3 :List of 3
  ..$ 1: symbol if
  ..$ 2: language !missing(formula) && nargs() == 1 && is.list(formula) && !is.null(m <- formula$model)
  ..$ 3: language return(m)
 4 :List of 3
  ..$ 1: symbol if
  ..$ 2: language !missing(formula) && nargs() == 1 && is.list(formula) && all(c("terms",      "call") %in% names(formula))
  ..$ 3:List of 8
  .. ..$ 1: symbol {
  .. ..$ 2: language fcall <- formula$call
  .. ..$ 3: language m <- match(c("formula", "data", "subset", "weights", "na.action"), names(fcall),      0)
  .. ..$ 4: language fcall <- fcall[c(1, m)]
  .. ..$ 5: language fcall[[1L]] <- as.name("model.frame")
  .. ..$ 6: language env <- environment(formula$terms)
  .. ..$ 7:List of 3
  .. .. ..$ 1: symbol if
  .. .. ..$ 2: language is.null(env)
  .. .. ..$ 3: language env <- parent.frame()
  .. ..$ 8: language return(eval(fcall, env, parent.frame()))
 5 :List of 4
  ..$ 1: symbol if
  ..$ 2: language missing(formula)
  ..$ 3:List of 3
  .. ..$ 1: symbol {
  .. ..$ 2:List of 3
  .. .. ..$ 1: symbol if
  .. .. ..$ 2: language !missing(data) && inherits(data, "data.frame") && length(attr(data, "terms"))
  .. .. ..$ 3: language return(data)
  .. ..$ 3: language formula <- as.formula(data)
  ..$ 4:List of 3
  .. ..$ 1: symbol if
  .. ..$ 2: language missing(data) && inherits(formula, "data.frame")
  .. ..$ 3:List of 4
  .. .. ..$ 1: symbol {
  .. .. ..$ 2:List of 3
  .. .. .. ..$ 1: symbol if
  .. .. .. ..$ 2: language length(attr(formula, "terms"))
  .. .. .. ..$ 3: language return(formula)
  .. .. ..$ 3: language data <- formula
  .. .. ..$ 4: language formula <- as.formula(data)
 6 : language formula <- as.formula(formula)
 7 :List of 3
  ..$ 1: symbol if
  ..$ 2: language missing(na.action)
  ..$ 3:List of 2
  .. ..$ 1: symbol {
  .. ..$ 2:List of 4
  .. .. ..$ 1: symbol if
  .. .. ..$ 2: language !is.null(naa <- attr(data, "na.action")) & mode(naa) != "numeric"
  .. .. ..$ 3: language na.action <- naa
  .. .. ..$ 4:List of 3
  .. .. .. ..$ 1: symbol if
  .. .. .. ..$ 2: language !is.null(naa <- getOption("na.action"))
  .. .. .. ..$ 3: language na.action <- naa
 8 :List of 4
  ..$ 1: symbol if
  ..$ 2: language missing(data)
  ..$ 3: language data <- environment(formula)
  ..$ 4:List of 4
  .. ..$ 1: symbol if
  .. ..$ 2: language !is.data.frame(data) && !is.environment(data) && !is.null(attr(data, "class"))
  .. ..$ 3: language data <- as.data.frame(data)
  .. ..$ 4:List of 3
  .. .. ..$ 1: symbol if
  .. .. ..$ 2: language is.array(data)
  .. .. ..$ 3: language stop("'data' must be a data.frame, not a matrix or an array")
 9 :List of 3
  ..$ 1: symbol if
  ..$ 2: language !inherits(formula, "terms")
  ..$ 3: language formula <- terms(formula, data = data)
 10: language env <- environment(formula)
 11: language rownames <- .row_names_info(data, 0L)
 12: language vars <- attr(formula, "variables")
 13: language predvars <- attr(formula, "predvars")
 14:List of 3
  ..$ 1: symbol if
  ..$ 2: language is.null(predvars)
  ..$ 3: language predvars <- vars
 15: language varnames <- sapply(vars, function(x) paste(deparse(x, width.cutoff = 500),      collapse = " "))[-1L]
 16: language variables <- eval(predvars, data, env)
 17: language resp <- attr(formula, "response")
 18:List of 3
  ..$ 1: symbol if
  ..$ 2: language is.null(rownames) && resp > 0L
  ..$ 3:List of 3
  .. ..$ 1: symbol {
  .. ..$ 2: language lhs <- variables[[resp]]
  .. ..$ 3: language rownames <- if (is.matrix(lhs)) rownames(lhs) else names(lhs)
 19:List of 3
  ..$ 1: symbol if
  ..$ 2: language possible_newdata && length(variables)
  ..$ 3:List of 3
  .. ..$ 1: symbol {
  .. ..$ 2: language nr2 <- max(sapply(variables, NROW))
  .. ..$ 3:List of 3
  .. .. ..$ 1: symbol if
  .. .. ..$ 2: language nr2 != nr
  .. .. ..$ 3: language warning(gettextf("'newdata' had %d rows but variable(s) found have %d rows",      nr, nr2), call. = FALSE)
 20:List of 3
  ..$ 1: symbol if
  ..$ 2: language is.null(attr(formula, "predvars"))
  ..$ 3:List of 3
  .. ..$ 1: symbol {
  .. ..$ 2:List of 4
  .. .. ..$ 1: symbol for
  .. .. ..$ 2: symbol i
  .. .. ..$ 3: language seq_along(varnames)
  .. .. ..$ 4: language predvars[[i + 1]] <- makepredictcall(variables[[i]], vars[[i + 1]])
  .. ..$ 3: language attr(formula, "predvars") <- predvars
 21: language extras <- substitute(list(...))
 22: language extranames <- names(extras[-1L])
 23: language extras <- eval(extras, data, env)
 24: language subset <- eval(substitute(subset), data, env)
 25: language data <- .Internal(model.frame(formula, rownames, variables, varnames, extras,      extranames, subset, na.action))
 26:List of 4
  ..$ 1: symbol if
  ..$ 2: language length(xlev)
  ..$ 3:List of 2
  .. ..$ 1: symbol {
  .. ..$ 2:List of 4
  .. .. ..$ 1: symbol for
  .. .. ..$ 2: symbol nm
  .. .. ..$ 3: language names(xlev)
  .. .. ..$ 4:List of 3
  .. .. .. ..$ 1: symbol if
  .. .. .. ..$ 2: language !is.null(xl <- xlev[[nm]])
  .. .. .. ..$ 3:List of 4
  .. .. .. .. ..$ 1: symbol {
  .. .. .. .. ..$ 2: language xi <- data[[nm]]
  .. .. .. .. ..$ 3:List of 3
  .. .. .. .. .. ..$ 1: symbol if
  .. .. .. .. .. ..$ 2: language is.character(xi)
  .. .. .. .. .. ..$ 3:List of 3
  .. .. .. .. .. .. ..$ 1: symbol {
  .. .. .. .. .. .. ..$ 2: language xi <- as.factor(xi)
  .. .. .. .. .. .. ..$ 3: language warning(gettextf("character variable '%s' changed to a factor", nm), domain = NA)
  .. .. .. .. ..$ 4:List of 4
  .. .. .. .. .. ..$ 1: symbol if
  .. .. .. .. .. ..$ 2: language !is.factor(xi) || is.null(nxl <- levels(xi))
  .. .. .. .. .. ..$ 3: language warning(gettextf("variable '%s' is not a factor", nm), domain = NA)
  .. .. .. .. .. ..$ 4:List of 5
  .. .. .. .. .. .. ..$ 1: symbol {
  .. .. .. .. .. .. ..$ 2: language xi <- xi[, drop = TRUE]
  .. .. .. .. .. .. ..$ 3: language nxl <- levels(xi)
  .. .. .. .. .. .. ..$ 4:List of 3
  .. .. .. .. .. .. .. ..$ 1: symbol if
  .. .. .. .. .. .. .. ..$ 2: language any(m <- is.na(match(nxl, xl)))
  .. .. .. .. .. .. .. ..$ 3: language stop(gettextf("factor '%s' has new level(s) %s", nm, paste(nxl[m], collapse = ", ")),      domain = NA)
  .. .. .. .. .. .. ..$ 5: language data[[nm]] <- factor(xi, levels = xl, exclude = NULL)
  ..$ 4:List of 3
  .. ..$ 1: symbol if
  .. ..$ 2: symbol drop.unused.levels
  .. ..$ 3:List of 2
  .. .. ..$ 1: symbol {
  .. .. ..$ 2:List of 4
  .. .. .. ..$ 1: symbol for
  .. .. .. ..$ 2: symbol nm
  .. .. .. ..$ 3: language names(data)
  .. .. .. ..$ 4:List of 3
  .. .. .. .. ..$ 1: symbol {
  .. .. .. .. ..$ 2: language x <- data[[nm]]
  .. .. .. .. ..$ 3:List of 3
  .. .. .. .. .. ..$ 1: symbol if
  .. .. .. .. .. ..$ 2: language is.factor(x) && length(unique(x[!is.na(x)])) < length(levels(x))
  .. .. .. .. .. ..$ 3: language data[[nm]] <- data[[nm]][, drop = TRUE]
 27: language attr(formula, "dataClasses") <- sapply(data, .MFclass)
 28: language attr(data, "terms") <- formula
 29: symbol data
Michael Hoffman
  • 32,526
  • 7
  • 64
  • 86
  • Thanks. This is some clever code. Unfortunately, R's print method for lists blows the results so far apart that this isn't as helpful as it would otherwise be. Do you have any ideas about how the code might be printed compactly, with the list subscripts set off to the right on the same line as their corresponding code? – Josh O'Brien Jul 04 '12 at 07:50
  • @JoshO'Brien I've made the output more compact using `setNames()` and `str()`. The format could probably be improved further but I'm going to leave that as an exercise to the reader. – Michael Hoffman Jul 04 '12 at 18:09
  • Super. Using `setNames()` to label the parts is a great idea. I may play around with some of the options to `str()`, and if I come up with something better I will (with your permission) edit the post accordingly. Thanks again. – Josh O'Brien Jul 04 '12 at 18:51
  • 3
    Your function has been so helpful to me over the past year or so that I'm putting up a bounty as additional thanks and on the off-chance that it'll help somebody else. (As a side note, S-plus apparently had a function `tprint()` that did something like what you cooked up. According to Venables and Ripley, it "produce[d] a numbered listing of the body of a function for use with the at argument of trace".) – Josh O'Brien Mar 22 '13 at 17:17
  • 1
    Thank you! I've come back to use the function four times at least, and each time it has worked smoothly (where kohske's and my answers generally don't). I thought that deserved a shout out! – Josh O'Brien Mar 25 '13 at 21:23
8

Here is a convenient wrapper for detecting the piece:

library(codetools)
ff <- function(f, tar) {
  cc <- function(e, w) {
    if(length(w$pos) > 0 &&
      grepl(w$tar, paste(deparse(e), collapse = "\n"), fixed = TRUE)) {
      cat(rev(w$pos), ": ", deparse(e), "\n")
      w$ret$vals <- c(w$ret$vals, list(rev(w$pos)))
    }
    w$pos <- c(0, w$pos)
    for (ee in as.list(e)){
      if (!missing(ee)) {      
        w$pos[1] <- w$pos[1] + 1
        walkCode(ee, w)
      }
    }
  }

  w <- list(pos = c(),
            tar = tar,
            ret = new.env(),
            handler = function(v, w) NULL,
            call = cc,
            leaf = function(e, w) NULL)
  walkCode(body(f), w = w)
  w$ret$vals
}

and then,

> r <- ff(pretty.default, "delta <- diff(range(z$l, z$u))/z$n")
12 :  if (!eps.correct && z$n) {     delta <- diff(range(z$l, z$u))/z$n     if (any(small <- abs(s) < 1e-14 * delta))          s[small] <- 0 } 
12 3 :  {     delta <- diff(range(z$l, z$u))/z$n     if (any(small <- abs(s) < 1e-14 * delta))          s[small] <- 0 } 
12 3 2 :  delta <- diff(range(z$l, z$u))/z$n 
> r
[[1]]
[1] 12

[[2]]
[1] 12  3

[[3]]
[1] 12  3  2

> r <- ff(model.frame.default, "stop(gettextf(\"factor '%s' has new level(s) %s\", nm, paste(nxl[m],")
26 3 2 4 3 4 4 4 3 :  stop(gettextf("factor '%s' has new level(s) %s", nm, paste(nxl[m],      collapse = ", ")), domain = NA) 
> r
[[1]]
[1] 26  3  2  4  3  4  4  4  3

and you can define the tracer by contents:

traceby <- function(fun, tar, cer) {
  untrace(deparse(substitute(fun)))
  r <- ff(fun, tar)
  r <- r[which.max(sapply(r, length))]
  trace(what = deparse(substitute(fun)), tracer = cer, at = r)
}

then,

> traceby(pretty.default, "if (any(small <- abs(s) < 1e-14 * delta)) s[small] <- 0", quote(cat("\nThe value of delta is: ", delta, "\n\n")))
Untracing function "pretty.default" in package "base"
12 3 3 :  if (any(small <- abs(s) < 1e-14 * delta)) s[small] <- 0 
Tracing function "pretty.default" in package "base"
[1] "pretty.default"
> a <- pretty(c(1, 7843))
Tracing pretty.default(c(1, 7843)) step 12,3,3 

The value of delta is:  2000 

> b <- pretty(c(2, 23))
Tracing pretty.default(c(2, 23)) step 12,3,3 

The value of delta is:  5 
kohske
  • 65,572
  • 8
  • 165
  • 155
  • Thanks. It looks like this works well, though I confess I've got no idea what `walkCode()` does or how it works. (Neither `?walkCode` nor `walkCode` clarify matters much.) Can you recommend any helpful intros to the `codetools` package, or other resources that would clarify what `walkCode` is doing? – Josh O'Brien Jul 04 '12 at 07:31
  • The best place to start ([as mentioned here by kohske](https://github.com/yihui/formatR/pull/20#issuecomment-6715938)) is to look at the following 5 functions from the **codetools** package: `walkCode`, `makeCodeWalker`, `showTree`, `codetools:::showTreeCall`, and `codetools:::showTreeLeaf`. – Josh O'Brien Jun 22 '13 at 21:59
1

Here's an approach which takes advantage of the fact that findLineNum() in the utils package can be used to determine the step corresponding to a specified line in a given source file.

getStep <- function(fun, txt) {
    ## Create a text file into which the function can dumped
    ## and from which it can then be sourced
    tmpfile <- tempfile() 
    on.exit(unlink(tmpfile))
    dump(fun, file = tmpfile)
    ## Find the line containing the code of interest
    lines <- readLines(tmpfile)
    matchlines <- grepl(txt, lines, fixed=TRUE)
    if(sum(matchlines) > 1) {
        stop(paste(dQuote(txt), "matches more than one line in", fun))
    }
    linenum <- which(matchlines)
    ## Use findLineNum() to determine the step corresponding to that line
    source(tmpfile)
    Step <- list(findLineNum(tmpfile, line=linenum)[[1]]$at)
    ## Clean up and return
    rm(list = fun, envir = .GlobalEnv)
    return(Step)
}

## Test it
getStep(fun = "pretty.default",
        txt = "if (any(small <- abs(s) < 1e-14 * delta))")
# [[1]]
# [1] 6  3  3

It's then a small step to incorporate getStep() in a function that inserts debugging code in function fun at the line matching txt.

## Define the function
traceLine <- function(fun, txt, tracer) {
    Step <- getStep(fun = deparse(substitute(fun)), txt = txt)
    trace(what = substitute(fun),
          tracer = tracer,
          at = Step)
}

## Confirm that it works.
traceLine(fun = pretty.default,
          txt = "if (any(small <- abs(s) < 1e-14 * delta))",
          tracer = quote(cat("\nThe value of delta is: ", delta, "\n\n")))
a <- pretty(c(1, 7843))
b <- pretty(c(2, 23))
untrace(pretty.default)
Josh O'Brien
  • 159,210
  • 26
  • 366
  • 455