Error while reading with feather package

Question

I am reading a ~50 MB csv file with read_feather from the feather package.

The read fails with the following error:

Error in .Call("feather_coldataFeather", PACKAGE = "feather", feather,  : 
negative length vectors are not allowed

I have not found a description of this error. I have read another file this way before and did not get this error, so I am a bit stumped.

Thanks in advance for your hints.

My guess is that if the call works for other `csv` files but not this one, this file may have a problem. Could you check the file manually for correctness? — Tim Biegeleisen, Sep 16 '16 at 11:24
See https://stat.ethz.ch/pipermail/r-help/2015-January/425051.html; probably your vector is actually too long. Also, [this](http://stackoverflow.com/questions/36842263/memory-limits-in-data-table-negative-length-vectors-are-not-allowed) is probably related. — Cath, Sep 16 '16 at 11:24
@Cath Thanks for the hint, but I am not sure it hit the limit. The actual table is about 82k x 151, so I am re-downloading a new one. As far as I remember, I used to have 1.4 million rows and 35 columns, and reading was OK. — Dimon D., Sep 16 '16 at 11:47
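The size argument in the comment above can be checked with a little arithmetic. The sketch below assumes that the relevant limit is a signed 32-bit length (2^31 - 1 elements), which is a common source of "negative length" errors but is not confirmed as the cause here. The helper `as_int32` is hypothetical and only illustrates how a value past that limit, or a corrupt length field, turns negative when read as a signed 32-bit integer.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit length can hold


def as_int32(n: int) -> int:
    """Reinterpret the low 32 bits of n as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", n & 0xFFFFFFFF))[0]


# Approximate table size quoted in the comment above (82k x 151).
rows, cols = 82_000, 151
cells = rows * cols

# The total cell count is far below the 32-bit limit.
print(cells, cells <= INT32_MAX)

# A length one past the limit wraps around to a negative number,
# which is the kind of value a "negative length" error complains about.
print(as_int32(INT32_MAX + 1))
```

A table of this size is nowhere near the limit, which fits the later finding that the data itself, not its volume, was the problem.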
To see if it's a corrupt file (Tim's suggestion) or something R can't deal with (Cath's suggestion) can you try reading it with Python? `pip install feather` at the cmdline then `import feather` and `feather.read_dataframe(path)` in python code? — hrbrmstr, Sep 16 '16 at 11:48
@hrbrmstr Thanks for the additional hint :-) well, I am going to try a new data set to test. If the performance is poor I will try python. However, I would like to solve the issue in R environment. — Dimon D., Sep 16 '16 at 11:51
Unfortunately R & Python are the only two non-Java environments that I know of that can help validate a Feather file and it's possible there's a bug in the R package (there was a bug in Feather itself dealing with huge files, but that's not your issue here). — hrbrmstr, Sep 16 '16 at 11:55
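A cheap first check for a truncated or mislabeled file needs no Feather library at all. The sketch below assumes the original (version 1) Feather layout, in which the file both starts and ends with the 4-byte magic string `FEA1`. Newer Arrow-based Feather files use a different magic, so a `False` result there does not mean corruption. The function name `looks_like_feather_v1` is hypothetical, and the check says nothing about whether the column data inside is valid.

```python
import os

FEATHER_V1_MAGIC = b"FEA1"  # assumed magic bytes of the version 1 format


def looks_like_feather_v1(path: str) -> bool:
    """Return True if the file starts and ends with the Feather v1 magic."""
    # Leading magic + 4-byte metadata size + trailing magic is the bare minimum.
    if os.path.getsize(path) < 12:
        return False
    with open(path, "rb") as fh:
        head = fh.read(4)
        fh.seek(-4, os.SEEK_END)  # jump to the last four bytes
        tail = fh.read(4)
    return head == FEATHER_V1_MAGIC and tail == FEATHER_V1_MAGIC
```

Calling `looks_like_feather_v1("data.feather")` (the path is a placeholder) on a file that was cut off during download or that is really a csv would return `False`, which separates "broken file" from "reader bug" before any deeper debugging.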
@hrbrmstr I have recompiled the dataset - removed all NA, made sure that numerics are numerics (no non-numeric values). And feather is ok — Dimon D., Sep 20 '16 at 09:47
Nice. Glad it's sorted out, but I'm starting to wonder if R/Python + Feather is worth it vs R/Python + Spark + Parquet. The latter is _a lot_ of extra deps, but at least parquet seems to be a more stable format (I'm going through this selection process for work, which is one of the reasons I'm bringing it up). — hrbrmstr, Sep 20 '16 at 10:17
@hrbrmstr Thanks for the hint about the parquet format. I will take it into account, but for now I am sticking with the good old csv format. I got interested in feather because of the performance tests I ran ([read/write from/to file](https://rpubs.com/demydd)). — Dimon D., Sep 20 '16 at 12:27
