Whether the source code of an R function is preserved internally (via its srcref attribute) depends on the value of option keep.source when the function is defined. By source code, I mean the code as entered by the user, with comments, possibly inconsistent indentation, possibly inconsistent spacing around operators, etc.
options(keep.source = FALSE)
f <- function(x) {
    ## A comment
        x +
  1}
getSrcref(f)
## NULL # (invisibly)
deparse(f, control = "all")
## [1] "function (x) "
## [2] "{"
## [3] "    x + 1"
## [4] "}"
options(keep.source = TRUE)
g <- function(x) {
    ## A comment
        x +
  1}
getSrcref(g)
## function(x) {
##     ## A comment
##         x +
##   1}
deparse(g, control = "all")
## [1] "function(x) {"
## [2] "    ## A comment"
## [3] "        x +"
## [4] "  1}"
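If you would rather not depend on the global option, the same contrast can be reproduced by passing keep.source directly to parse. This is only a sketch: hasSrcref is a hypothetical helper name, not something provided by R, and the function text is made up for illustration.

```r
## Hypothetical helper (not part of R): does 'f' carry source references?
hasSrcref <- function(f) !is.null(utils::getSrcref(f))

code <- "function(x) {\n    ## A comment\n    x + 1\n}"

## Parse the same text without and with source references, then evaluate
## the resulting 'function' expressions to obtain closures
f0 <- eval(parse(text = code, keep.source = FALSE))
f1 <- eval(parse(text = code, keep.source = TRUE))

hasSrcref(f0)
## [1] FALSE
hasSrcref(f1)
## [1] TRUE
```

Passing keep.source explicitly also makes the result independent of whether the code runs interactively or in a script.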
Whether functions in a contributed package retain their source code depends on the options passed to R CMD INSTALL when the package was built from sources (by you or by CRAN). The default is to discard source code, but you can avoid that by installing from sources and setting the --with-keep.source flag:
install.packages(pkgs, type = "source", INSTALL_opts = "--with-keep.source")
Functions in base packages (base, stats, etc.) won't have their source code unless you build R itself from sources with environment variable R_KEEP_PKG_SOURCE set to yes (at least, that is what I infer from ?options). To learn about building R, see the corresponding manual.
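To check what your own installation retained, you can test the functions of a loaded namespace for a srcref attribute. The helper below is a rough sketch: srcrefStatus is a name I made up, and "stats" is just a convenient example package. The result depends entirely on how R and the package were built, so I don't show any output.

```r
## Hypothetical helper: for each function in the namespace of 'pkg',
## report whether it carries a "srcref" attribute
srcrefStatus <- function(pkg) {
    ns <- asNamespace(pkg)
    objs <- mget(ls(ns, all.names = TRUE), envir = ns)
    funs <- Filter(is.function, objs)
    vapply(funs, function(f) !is.null(attr(f, "srcref")), logical(1L))
}

status <- srcrefStatus("stats")
## Proportion of functions in 'stats' that retained their source code
mean(status)
```

A proportion of 0 means the package was installed without source references.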
Given a function with source references, you can programmatically extract comments from the source code. A quick and dirty approach is pattern matching:
zzz <- deparse(g, control = "all")
grep("#", zzz, value = TRUE)
## [1] "    ## A comment"
There can be false positives, though, because the pattern # also matches strings and non-syntactic names containing the hash character, which aren't comments at all.
grep("#", "\"## Not a comment\"", value = TRUE)
## [1] "\"## Not a comment\""
A much more robust way to extract comments is to examine the parse data for tokens of type COMMENT:
getParseData(parse(text = zzz), includeText = NA)
##    line1 col1 line2 col2 id parent          token terminal         text
## 23     1    1     4    4 23      0           expr    FALSE
## 1      1    1     1    8  1     23       FUNCTION     TRUE     function
## 2      1    9     1    9  2     23            '('     TRUE            (
## 3      1   10     1   10  3     23 SYMBOL_FORMALS     TRUE            x
## 4      1   11     1   11  4     23            ')'     TRUE            )
## 20     1   13     4    4 20     23           expr    FALSE
## 6      1   13     1   13  6     20            '{'     TRUE            {
## 8      2    5     2   16  8     20        COMMENT     TRUE ## A comment
## 17     3    9     4    3 17     20           expr    FALSE
## 10     3    9     3    9 10     12         SYMBOL     TRUE            x
## 12     3    9     3    9 12     17           expr    FALSE
## 11     3   11     3   11 11     17            '+'     TRUE            +
## 14     4    3     4    3 14     15      NUM_CONST     TRUE            1
## 15     4    3     4    3 15     17           expr    FALSE
## 16     4    4     4    4 16     20            '}'     TRUE            }
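To see why the token type matters, here is a small sketch contrasting the two approaches on a case where pattern matching goes wrong. The function text is made up for illustration, and keep.source is passed to parse explicitly so that the result doesn't depend on the global option.

```r
code <- c("function(x) {",
          "    ## A real comment",
          "    paste(x, \"# not a comment\")",
          "}")

## Pattern matching flags both lines containing a hash character
grep("#", code, value = TRUE)

## The parse data distinguishes the comment from the string constant:
## only the former is a token of type COMMENT
pd <- utils::getParseData(parse(text = code, keep.source = TRUE),
                          includeText = NA)
comments <- pd$text[pd$token == "COMMENT"]
comments
## [1] "## A real comment"
```

The string containing the hash character shows up in the parse data as a token of type STR_CONST, so it is never mistaken for a comment.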
Clearly, getParseData returns much more information than you need. Here is a utility that you can use instead, which takes as an argument a function with source references and returns a character vector listing the comments, if any:
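getComments <- function(func) {
    func <- match.fun(func)
    if (is.null(getSrcref(func))) {
        stop("'func' has no source references")
    }
    ## Use the parse data stored alongside the source references, if any
    data <- getParseData(func, includeText = NA)
    if (is.null(data)) {
        ## Otherwise, reparse the retained source code, temporarily
        ## ensuring that the parser keeps source and parse data
        op <- options(keep.source = TRUE, keep.parse.data = TRUE)
        on.exit(options(op))
        expr <- parse(text = deparse(func, control = "all"))
        data <- getParseData(expr, includeText = NA)
    }
    ## Keep the text of comment tokens only
    data$text[data$token == "COMMENT"]
}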
getComments(g)
## [1] "## A comment"
h <- function(x) {
    ## I will comment
    ## anywhere
    ######## and with as many hashes
    x + 1 # as I want!
}
getComments(h)
## [1] "## I will comment"
## [2] "## anywhere"
## [3] "######## and with as many hashes"
## [4] "# as I want!"
## You will need Rtools on Windows and Command Line Tools on macOS
## to install packages containing C/C++/Fortran code from sources.
## 'lme4' is one such package ... feel free to choose a different one.
install.packages("lme4", type = "source", INSTALL_opts = "--with-keep.source")
getComments(lme4::lmer)
## [1] "## , ...)"
## [2] "## see functions in modular.R for the body .."
## [3] "## back-compatibility kluge"
## [4] "## if (!is.null(list(...)[[\"family\"]])) {"
## [5] "## warning(\"calling lmer with 'family' is deprecated; please use glmer() instead\")"
## [6] "## mc[[1]] <- quote(lme4::glmer)"
## [7] "## if(missCtrl) mc$control <- glmerControl()"
## [8] "## return(eval(mc, parent.frame(1L)))"
## [9] "## }"
## [10] "## update for back-compatibility kluge"
## [11] "## https://github.com/lme4/lme4/issues/50"
## [12] "## parse data and formula"
## [13] "## create deviance function for covariance parameters (theta)"
## [14] "## optimize deviance function over covariance parameters"
## [15] "## prepare output"
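If you also want to know where each comment sits, the same parse data carries line and column numbers. The variant below is a sketch under a few assumptions: commentTable is a hypothetical name, it always reparses the function's own source text (so the line numbers are relative to the first line of the function, not to the file it came from), and the example function k is made up.

```r
## Hypothetical variant: return the position of each comment with its text
commentTable <- function(func) {
    func <- match.fun(func)
    if (is.null(utils::getSrcref(func))) {
        stop("'func' has no source references")
    }
    ## Reparse the retained source code of 'func' on its own
    expr <- parse(text = deparse(func, control = "all"), keep.source = TRUE)
    data <- utils::getParseData(expr, includeText = NA)
    data[data$token == "COMMENT", c("line1", "col1", "text")]
}

k <- eval(parse(text = "function(x) {\n    ## first\n    x + 1 # second\n}",
                keep.source = TRUE))
tab <- commentTable(k)
tab$text
## [1] "## first" "# second"
```

Here tab$line1 gives the line of each comment within the function, which can help when a comment only makes sense next to the code it annotates.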
AFAIK, there is no convenient mechanism for checking whether C code called by an R function contained comments before it was compiled...
Relevant documentation is a bit scattered, as always. I have found these help pages useful: ?parse, ?deparse, ?.deparseOpts, ?srcref (and links therein), ?options, and ?getParseData.