Ben Bolker's answer to this question and the article by Uwe Ligges are already very useful
when I try to "decode" a primitive or internal R function.
But how is a primitive R function connected with its corresponding C function?
I guess that .Primitive must somehow provide this missing link.
Take is.na, for example:
> is.na
function (x) .Primitive("is.na")
The table FUNTAB R_FunTab[] in the file "names.c" contains the entry
{"is.na", do_isna, 0, 1, 1, {PP_FUNCALL, PREC_FN, 0}},
which means that is.na uses the C function do_isna.
do_isna is defined in the file "coerce.c":
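To make my reading of that table entry concrete, here is a toy sketch of a name-to-function table with a lookup by name. This is only an illustration of the idea, not R's actual code: every identifier starting with toy_ is hypothetical, the handler takes a plain double instead of the four SEXP arguments, and the second table entry is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified handler type; R's real handlers take
   (SEXP call, SEXP op, SEXP args, SEXP rho) and return a SEXP. */
typedef int (*toy_cfun)(double arg);

/* One row of the toy table: the name given to .Primitive(),
   the C function behind it, and the number of arguments. */
typedef struct {
    const char *name;
    toy_cfun    cfun;
    int         arity;
} toy_funtab_entry;

/* NaN is the only floating-point value that differs from itself. */
static int toy_do_isna(double x) { return x != x; }

/* Made-up second handler, present only so the lookup has to search. */
static int toy_do_positive(double x) { return x > 0.0; }

/* The table ends with a NULL-name sentinel row. */
static const toy_funtab_entry toy_funtab[] = {
    {"toy.positive", toy_do_positive, 1},
    {"is.na",        toy_do_isna,     1},
    {NULL,           NULL,            0}
};

/* Linear search by name; returns NULL if the name is not in the table. */
static const toy_funtab_entry *toy_lookup(const char *name)
{
    const toy_funtab_entry *e;
    assert(name != NULL);
    for (e = toy_funtab; e->name != NULL; e++)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

/* Look the name up and call its C function.
   Returns 0 on success, -1 if the name is unknown. */
int toy_call_primitive(const char *name, double arg, int *result)
{
    const toy_funtab_entry *e = toy_lookup(name);
    if (e == NULL)
        return -1;
    *result = e->cfun(arg);
    return 0;
}
```

With this sketch, toy_call_primitive("is.na", 3.0, &r) finds the is.na row and stores 0 in r. What the sketch cannot show is exactly what I am asking about: where the real call, op, args and rho come from.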
SEXP attribute_hidden do_isna(SEXP call, SEXP op, SEXP args, SEXP rho)
{
    SEXP ans, dims, names, x;
    R_xlen_t i, n;
    checkArity(op, args);
    check1arg(args, call, "x");
    if (DispatchOrEval(call, op, "is.na", args, rho, &ans, 1, 1))
        return(ans);
    PROTECT(args = ans);
#ifdef stringent_is
    if (!isList(CAR(args)) && !isVector(CAR(args)))
        errorcall_return(call, "is.na " R_MSG_list_vec);
#endif
    x = CAR(args);
    n = xlength(x);
    PROTECT(ans = allocVector(LGLSXP, n));
    if (isVector(x)) {
        PROTECT(dims = getAttrib(x, R_DimSymbol));
        if (isArray(x))
            PROTECT(names = getAttrib(x, R_DimNamesSymbol));
        else
            PROTECT(names = getAttrib(x, R_NamesSymbol));
    }
    else dims = names = R_NilValue;
    switch (TYPEOF(x)) {
    case LGLSXP:
        for (i = 0; i < n; i++)
            LOGICAL(ans)[i] = (LOGICAL(x)[i] == NA_LOGICAL);
        break;
    case INTSXP:
        for (i = 0; i < n; i++)
            LOGICAL(ans)[i] = (INTEGER(x)[i] == NA_INTEGER);
        break;
    case REALSXP:
        for (i = 0; i < n; i++)
            LOGICAL(ans)[i] = ISNAN(REAL(x)[i]);
        break;
    case CPLXSXP:
        for (i = 0; i < n; i++)
            LOGICAL(ans)[i] = (ISNAN(COMPLEX(x)[i].r) ||
                               ISNAN(COMPLEX(x)[i].i));
        break;
    case STRSXP:
        for (i = 0; i < n; i++)
            LOGICAL(ans)[i] = (STRING_ELT(x, i) == NA_STRING);
        break;
    /* Same code for LISTSXP and VECSXP : */
#define LIST_VEC_NA(s) \
    if (!isVector(s) || length(s) != 1) \
        LOGICAL(ans)[i] = 0; \
    else { \
        switch (TYPEOF(s)) { \
        case LGLSXP: \
        case INTSXP: \
            LOGICAL(ans)[i] = (INTEGER(s)[0] == NA_INTEGER); \
            break; \
        case REALSXP: \
            LOGICAL(ans)[i] = ISNAN(REAL(s)[0]); \
            break; \
        case STRSXP: \
            LOGICAL(ans)[i] = (STRING_ELT(s, 0) == NA_STRING); \
            break; \
        case CPLXSXP: \
            LOGICAL(ans)[i] = (ISNAN(COMPLEX(s)[0].r) \
                               || ISNAN(COMPLEX(s)[0].i)); \
            break; \
        default: \
            LOGICAL(ans)[i] = 0; \
        } \
    }
    case LISTSXP:
        for (i = 0; i < n; i++) {
            LIST_VEC_NA(CAR(x));
            x = CDR(x);
        }
        break;
    case VECSXP:
        for (i = 0; i < n; i++) {
            SEXP s = VECTOR_ELT(x, i);
            LIST_VEC_NA(s);
        }
        break;
    case RAWSXP:
        /* no such thing as a raw NA */
        for (i = 0; i < n; i++)
            LOGICAL(ans)[i] = 0;
        break;
    default:
        warningcall(call, _("%s() applied to non-(list or vector) of type '%s'"),
                    "is.na", type2char(TYPEOF(x)));
        for (i = 0; i < n; i++)
            LOGICAL(ans)[i] = 0;
    }
    if (dims != R_NilValue)
        setAttrib(ans, R_DimSymbol, dims);
    if (names != R_NilValue) {
        if (isArray(x))
            setAttrib(ans, R_DimNamesSymbol, names);
        else
            setAttrib(ans, R_NamesSymbol, names);
    }
    if (isVector(x))
        UNPROTECT(2);
    UNPROTECT(1);
    UNPROTECT(1); /*ans*/
    return ans;
}
But if we want to evaluate is.na(x=3), for example, how are the arguments call, op, args, rho generated?
At least some external information must be used; x=3 alone is not enough.
Moreover, at first glance the argument x is not used at all in the R definition, which of course cannot be the whole story:
> is.na
function (x) .Primitive("is.na")
The R code of .Primitive doesn't give a hint:
> .Primitive
function (name) .Primitive(".Primitive")
Taking all this into account, it is not surprising that isNA, an apparently exact copy of is.na, fails:
> isNA <- function (x) .Primitive("is.na")
> isNA
function (x) .Primitive("is.na")
> is.na
function (x) .Primitive("is.na")
> isNA(x=3)
function (x) .Primitive("is.na")
> is.na(x=3)
[1] FALSE
To put it plainly:
All of the C functions do_... take the arguments call, op, args, rho.
How are these arguments computed when a primitive R function is called?