Please consider the body of read.table as a text file, created with the following code:
sink("readTable.txt")
body(read.table)
sink()
Using regular expressions, I'd like to find all function calls of the form foo(a, b, c) (but with any number of arguments) in "readTable.txt". That is, I'd like the result to contain the names of all called functions in the body of read.table. This includes nested functions of the form foo(a, bar(b, c)). Reserved words (return, for, etc.) and functions that use back-ticks (`==`(), `+`(), etc.) can be included, since I can remove them later.
So in general, I'm looking for the pattern text( or text (, then possible nested functions like text1(text2(, but skipping over the text if it's an argument and not a function. Here's where I'm at so far. It's close, but not quite there.
x <- readLines("readTable.txt")
regx <- "^(([[:print:]]*)\\(+.*\\))"
mat <- regexpr(regx, x)
lines <- regmatches(x, mat)
fns <- gsub(".*( |(=|(<-)))", "", lines)
head(fns, 10)
# [1] "default.stringsAsFactors()" "!missing(text))"
# [3] "\"UTF-8\")" "on.exit(close(file))" "(is.character(file))"
# [6] "(nzchar(fileEncoding))" "fileEncoding)" "\"rt\")"
# [9] "on.exit(close(file))" "\"connection\"))"
For example, in [9] above, the calls are there, but I do not want file in the result. Ideally it would be on.exit(close(.
How can I go about improving this regular expression?
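One direction that might work, offered only as a minimal sketch: rather than anchoring at the start of each line and taking a single match with regexpr(), match every name that is immediately followed by optional whitespace and an opening parenthesis, using gregexpr() so that nested calls on the same line are all picked up. The two sample lines below are made up for illustration and stand in for the contents of "readTable.txt". The pattern is an assumption about what counts as a function name, and it would also match text inside string literals that happens to look like a call.

```r
# Illustrative input only; in practice this would be readLines("readTable.txt").
x <- c("on.exit(close(file))",
       "if (is.character(file)) file <- foo(a, bar(b, c))")

# A name (letters, digits, '.', '_') followed by optional whitespace and "(".
# Arguments such as 'file' are skipped because no "(" follows them.
pat <- "[[:alpha:].][[:alnum:]._]*[[:space:]]*\\("

# gregexpr() returns all matches per line; regexpr() returns only the first.
hits <- regmatches(x, gregexpr(pat, x))

# Strip the trailing "(" (and any whitespace before it) to leave bare names.
fns <- sub("[[:space:]]*\\($", "", unlist(hits))
fns
```

Reserved words such as if are still included here, which matches the requirement above that they can be filtered out afterwards. Wrapping the result in unique() would collapse repeated names.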