I'm looking for a fast SVD library in C, C++ or Java. Ultimately I'm using Java, but I'm very comfortable using JNA to wrap C++, e.g. http://github.com/hughperkins/jeigen
The library needs to handle sparse matrices. To keep this objective, so that the question doesn't get marked as too subjective, let's say:
- the target dataset is news20.binary, e.g. from http://mldata.org/repository/data/viewslug/news20binary/
- how long does it take to run?
- how much variance is retained, e.g. when keeping only the first 6 or 20 singular values (an S matrix of size 6 or 20)?
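To pin down the metric, here is a minimal sketch, assuming "average squared projection error" means the summed squared distance between each row and its projection onto the retained singular vectors, divided by the summed squared norm of the rows (so it equals 1 minus the fraction of variance retained). That reading is an assumption, as is skipping mean-centering, which would destroy sparsity. The class and method names are hypothetical, and the basis rows are assumed to be orthonormal.

```java
final class ProjectionError {
    /**
     * x:     m rows, each with n features (dense here for brevity).
     * basis: k rows of length n, ASSUMED orthonormal (e.g. the top-k right singular vectors).
     * Returns sum ||x_i - proj(x_i)||^2 / sum ||x_i||^2, the assumed error ratio.
     */
    static double errorRatio(double[][] x, double[][] basis) {
        double residual = 0.0;
        double total = 0.0;
        for (double[] row : x) {
            // Project the row onto the span of the basis vectors.
            double[] approx = new double[row.length];
            for (double[] b : basis) {
                double coeff = 0.0;
                for (int j = 0; j < row.length; j++) {
                    coeff += b[j] * row[j];
                }
                for (int j = 0; j < row.length; j++) {
                    approx[j] += coeff * b[j];
                }
            }
            // Accumulate the squared residual and the squared norm of the row.
            for (int j = 0; j < row.length; j++) {
                double d = row[j] - approx[j];
                residual += d * d;
                total += row[j] * row[j];
            }
        }
        // An all-zero matrix has nothing to lose, so report zero error.
        return total == 0.0 ? 0.0 : residual / total;
    }

    public static void main(String[] args) {
        double[][] x = {{3.0, 4.0}};
        double[][] basis = {{1.0, 0.0}};
        System.out.println(errorRatio(x, basis));
    }
}
```

In the toy example, the single row (3, 4) is projected onto the first axis, leaving a squared residual of 16 out of a total squared norm of 25. A ratio close to 1 therefore means almost none of the variance is retained.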
I looked around at a few libraries and found:
- MATLAB: super fast, about 10 seconds, but it's not really a 'library' as such. Average squared projection error: 0.93
- redsvd: super fast, about 1 second for 6 features, but the average squared projection error is 0.97, which is very high
- Eigen: its SVD is very slow and only supports dense matrices
- svdlibc: ran for 28 minutes before I stopped it; I guess it's calculating the full S, rather than just the first 6 features or so
Basically, I'm looking for a library whose speed and average squared projection error are about the same as MATLAB's, or at least somewhat comparable.
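To illustrate what "just the first few features" means computationally, here is a rough sketch of power iteration for the largest singular value of a sparse matrix. It is not how any of the libraries above work internally. It only shows that a truncated method needs nothing more than sparse matrix-vector products, so it never has to form the full decomposition. The COO storage layout and all names are hypothetical. Extending it to 6 or 20 singular vectors would need orthogonalization between vectors (e.g. subspace iteration or a Lanczos-type method), which is omitted here.

```java
final class TopSingularValue {
    // Sparse matrix in coordinate (COO) form: entry t is vals[t] at (rows[t], cols[t]).

    /** Computes y = A x, where A has m rows. */
    static double[] multiply(int[] rows, int[] cols, double[] vals, double[] x, int m) {
        double[] y = new double[m];
        for (int t = 0; t < vals.length; t++) {
            y[rows[t]] += vals[t] * x[cols[t]];
        }
        return y;
    }

    /** Computes x = A^T y, where A has n columns. */
    static double[] multiplyTransposed(int[] rows, int[] cols, double[] vals, double[] y, int n) {
        double[] x = new double[n];
        for (int t = 0; t < vals.length; t++) {
            x[cols[t]] += vals[t] * y[rows[t]];
        }
        return x;
    }

    static double norm(double[] v) {
        double s = 0.0;
        for (double e : v) {
            s += e * e;
        }
        return Math.sqrt(s);
    }

    /**
     * Power iteration on A^T A. Starts from a uniform unit vector, so it
     * assumes that vector is not orthogonal to the top right singular vector.
     */
    static double largest(int[] rows, int[] cols, double[] vals, int m, int n, int iterations) {
        double[] v = new double[n];
        java.util.Arrays.fill(v, 1.0 / Math.sqrt(n));
        double sigma = 0.0;
        for (int it = 0; it < iterations; it++) {
            double[] u = multiply(rows, cols, vals, v, m);
            sigma = norm(u); // ||A v|| with ||v|| = 1 estimates the top singular value
            if (sigma == 0.0) {
                return 0.0;
            }
            double[] w = multiplyTransposed(rows, cols, vals, u, n);
            double wn = norm(w);
            if (wn == 0.0) {
                return 0.0;
            }
            for (int j = 0; j < n; j++) {
                v[j] = w[j] / wn; // renormalize for the next round
            }
        }
        return sigma;
    }

    public static void main(String[] args) {
        // 2x2 diagonal matrix diag(3, 1) stored sparsely.
        int[] rows = {0, 1};
        int[] cols = {0, 1};
        double[] vals = {3.0, 1.0};
        System.out.println(largest(rows, cols, vals, 2, 2, 50));
    }
}
```

Each iteration costs time proportional to the number of non-zero entries, which is why a truncated solver on a sparse matrix can be far cheaper than a full dense decomposition.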