Now fixed in v1.8.9 on R-Forge.
- An unintended 50,000 column limit has been removed in
fread
. Thanks to mpmorley for reporting. Test added.
The reason was I got this part wrong in the fread.c source :
// *********************************************************************
// Allocate columns for known nrow
// *********************************************************************
ans=PROTECT(allocVector(VECSXP,ncol));
protecti++;
setAttrib(ans,R_NamesSymbol,names);
for (i=0; i<ncol; i++) {
thistype = TypeSxp[ type[i] ];
thiscol = PROTECT(allocVector(thistype,nrow)); // ** HERE **
protecti++;
if (type[i]==SXP_INT64)
setAttrib(thiscol, R_ClassSymbol, ScalarString(mkChar("integer64")));
SET_TRUELENGTH(thiscol, nrow);
SET_VECTOR_ELT(ans,i,thiscol);
}
According to R-exts section 5.9.1, that PROTECT inside the loop isn't needed :
In some cases it is necessary to keep better track of whether protection is really needed. Be
particularly aware of situations where a large number of objects are generated. The pointer
protection stack has a fixed size (default 10,000) and can become full. It is not a good idea
then to just PROTECT everything in sight and UNPROTECT several thousand objects at the end. It
will almost invariably be possible to either assign the objects as part of another object (which
automatically protects them) or unprotect them immediately after use.
So that PROTECT is now removed and all is well. (It seems that the pointer protection stack limit has been reduced to 50,000 since that text was written; Defn.h contains #define R_PPSSIZE 50000L
.) I've checked all other PROTECTs in data.table C source for anything similar and found and fixed one in assign.c too (when adding more than 50,000 columns by reference), no others.
Thanks for reporting!