Is it possible to monitor the progress of a vectorized operation in R? E.g., in a loop one can always do `if (i %% 10000 == 0) print(i)` to see which element the code is currently working on. My gut feeling is "probably not", but maybe I'm wrong?

- Exact code doesn't matter, it's the general concept. Let it be `gsub("hi","lo",vector)`, where `vector = rep("hi",1000000)` – Alexey Ferapontov Jun 09 '16 at 18:04
- I would say no, like you, since the looping in vectorized functions is done at the source code level, and vectorization (oversimplified) is passing one chunk of data and getting one chunk back; with for loops, you pass one chunk for each iteration so you can count the number of chunks (n=10000), but for vectorized operations, you don't have that amount of granularity (n = 1) – rawr Jun 09 '16 at 18:13
1 Answer
In my comment, I asked what your code is and how you achieve vectorization. I think this matters. Although generally speaking, vectorization is achieved by using loops in compiled code, I am not entirely sure of this. Therefore, I would like to be less confident in saying "absolutely no".
However, if you want to track progress at the R level, you must be able to get an index, like the `i` used in an R-level `for` loop. Now, let's check what most vectorized R functions look like:
    > grep
    function (pattern, x, ignore.case = FALSE, perl = FALSE, value = FALSE,
        fixed = FALSE, useBytes = FALSE, invert = FALSE)
    {
        if (!is.character(x))
            x <- structure(as.character(x), names = names(x))
        .Internal(grep(as.character(pattern), x, ignore.case, value,
            perl, fixed, useBytes, invert))
    }
    <bytecode: 0xa34dfe0>
    <environment: namespace:base>
    > gsub
    function (pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
        fixed = FALSE, useBytes = FALSE)
    {
        if (!is.character(x))
            x <- as.character(x)
        .Internal(gsub(as.character(pattern), as.character(replacement),
            x, ignore.case, perl, fixed, useBytes))
    }
In the above examples, we see that those vectorized R functions are merely thin wrappers around compiled code (see the `.Internal()` call). There is no explicit loop index for you to refer to. Hence, for those example functions, tracking progress is not possible.
I suggest you have a look at the particular function you used. That is the best way to convince yourself.
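Within a single call to such a wrapper nothing can be observed, but progress can be reported between calls if the input is split into chunks at the R level. The following is only a minimal sketch of that idea: `chunked_apply` and its `chunk_size` argument are hypothetical names introduced here, not part of base R, and the sketch assumes the function being applied works element-wise and returns one value per input element.

```r
# Hypothetical helper (not part of base R): run an element-wise vectorized
# function on consecutive chunks of x and report progress after each chunk.
chunked_apply <- function(x, f, chunk_size = 100000L, ...) {
  n <- length(x)
  if (n == 0L) return(f(x, ...))           # nothing to split
  starts <- seq(1L, n, by = chunk_size)    # first index of every chunk
  pieces <- vector("list", length(starts))
  for (k in seq_along(starts)) {
    idx <- starts[k]:min(starts[k] + chunk_size - 1L, n)
    pieces[[k]] <- f(x[idx], ...)          # one vectorized call per chunk
    message(sprintf("processed %d of %d elements", max(idx), n))
  }
  unlist(pieces, use.names = FALSE)
}

x <- rep("hi", 1000000)
result <- chunked_apply(x, function(v) gsub("hi", "lo", v))
```

The granularity is one message per chunk rather than per element, and every extra chunk adds some R-level overhead, so `chunk_size` trades reporting frequency against speed.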
Follow-up
Originally, I put `lapply` in my examples:
    > lapply
    function (X, FUN, ...)
    {
        FUN <- match.fun(FUN)
        if (!is.vector(X) || is.object(X))
            X <- as.list(X)
        .Internal(lapply(X, FUN))
    }
    <bytecode: 0x9c5c464>
    <environment: namespace:base>
Then @RichardScriven expressed his view of the `*apply` family. On Stack Overflow, these two posts/answers are extremely useful for understanding vectorization issues in R:
Truly, though `lapply` calls C code to do the loop, it has to evaluate the R function `FUN` along the loop. Hence:
- if `FUN` dominates execution time, then `lapply` will not have a noticeable advantage over R's `for` loop.
- if `FUN` does so little work that the loop overhead dominates the execution, then `lapply` will have a noticeable advantage over R's `for` loop, because a `for` loop in C is more lightweight.
Discussing the performance of `lapply` is off-topic in this post, so I will not attach examples for demonstration.
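Returning to the original question of monitoring progress: because `lapply` evaluates the R-level `FUN` once per element, a progress message can be placed inside `FUN` itself. The snippet below is only an illustrative sketch; `X`, the reporting interval of 10, and the use of `sqrt` as the per-element work are placeholders chosen for the example.

```r
X <- 1:50            # placeholder input
n <- length(X)

# Iterate over indices so that FUN knows which element it is working on.
res <- lapply(seq_along(X), function(i) {
  if (i %% 10 == 0) message("element ", i, " of ", n)
  sqrt(X[[i]])       # stand-in for the real per-element work
})
```

This works only because the loop body is an R function; it does not apply to wrappers such as `gsub`, where the whole loop runs inside compiled code.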
