I think you're suggesting short-circuit logic within ifelse (and dplyr::if_else and data.table::fifelse), and I don't see a way to do it safely across all use-cases. For example, realize that dmy(x) is a single function call with a vector as its argument; implementing short-circuiting would require that the ifelse-replacement function know to subset the x vector and call dmy on only the elements that need it. While it might seem logical that one might be able to specify the symbol(s) that need to be handled in this way, that starts complicating it a bit.
I think the best way to really do short-circuit-like processing here is a bit manual, controlling the vectorized elements yourself.
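library(lubridate)  # provides dmy() and mdy()
# x is the character vector of dates from the question
out <- rep(Sys.Date()[NA], length(x))  # start with an all-NA Date vector
for (fun in list(dmy, mdy)) {
  isna <- is.na(out)
  if (all(!isna)) break       # everything is parsed, stop early
  out[isna] <- fun(x[isna])   # try this parser only on the still-NA elements
}
# Warning: 2 failed to parse.
# Warning: 1 failed to parse.
out
# [1] "2001-01-20" "2001-02-28" NA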
This can be iterated over multiple functions (not just 2), or perhaps arguments for a single function (such as formats to attempt with as.POSIXct or similar). (More than two would be closer to dplyr::case_when than ifelse/dplyr::if_else ... which is one of the design benefits of case_when.)
With each pass through the for loop, only those elements that are still NA in out are processed in the next step; once non-NA, that element is "safe" and not touched again. Once all of out is non-NA, the loop breaks even if further candidate functions/formats are unused.
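The same loop can run over candidate formats for a single function instead of over several functions. Here is a minimal sketch of that variant. It uses base as.Date rather than as.POSIXct to keep it short, and both the input vector and the list of formats are made up for illustration.

```r
# Hypothetical input: two different date layouts plus one unparseable value.
x <- c("20/01/2001", "2001-02-28", "not a date")

# Candidate formats, tried in order (an assumption for this sketch).
formats <- c("%d/%m/%Y", "%Y-%m-%d")

out <- rep(as.Date(NA), length(x))
for (fmt in formats) {
  isna <- is.na(out)
  if (!any(isna)) break                         # nothing left to parse
  out[isna] <- as.Date(x[isna], format = fmt)   # only the still-NA elements
}
out
# [1] "2001-01-20" "2001-02-28" NA
```

With an explicit format, as.Date returns NA for strings that do not match instead of raising an error, so no tryCatch is needed here.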
This still has the problem where one branch of the ifelse is a cumulative calculation, requiring the presence of the whole vector before it. That takes a bit more logic and control, and will preclude short-circuiting of the tests before it executes. (It would help to have the cumulative calc done on the first pass or before the for loop. Without an example, I hope you can see the potential complexity here.)
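As a rough illustration of that last point, here is a made-up case (not from the question): negative elements should get the running total up to that position, and everything else gets sqrt. The running total has to be computed on the whole vector up front, and only the sqrt branch can be restricted to the elements that need it.

```r
# Hypothetical input for this sketch.
x <- c(4, -1, 9, -2)

# The cumulative branch needs the full vector, so compute it before subsetting.
running <- cumsum(x)          # 4, 3, 12, 10

out <- rep(NA_real_, length(x))
neg <- x < 0
out[neg]  <- running[neg]     # take the precomputed values where needed
out[!neg] <- sqrt(x[!neg])    # sqrt is only ever called on non-negative values
out
# [1]  2  3  3 10
```

Here sqrt never sees a negative number, so it emits no NaN warning, whereas ifelse(x < 0, cumsum(x), sqrt(x)) would evaluate sqrt on all of x.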