1

Is it possible to use the Python SciPy linear algebra library from a Spark/Scala script?

I need to use the rich sparse-matrix functions in SciPy, but my project is already written in Scala.

Francois Saab
  • Unlikely. The `scipy` code is written in Python with heavy use of `numpy`. `numpy` has a lot of compiled code, and so does `scipy.sparse`. Some of the linear algebra stuff uses external compiled libraries, which can be used by other code, but that's a C/C++/Fortran task. – hpaulj Nov 02 '16 at 00:56
  • Maybe use pyspark for that part? – Elliott Frisch Nov 02 '16 at 02:00
  • I think Breeze is the linear algebra package for Scala – OneCricketeer Nov 02 '16 at 03:34

2 Answers

1

It's not straightforward to use SciPy directly from Scala, because Python is not a JVM language, although there seem to be workarounds. The closest you might get in pure Scala is Breeze (ScalaNLP). You can check out their comparison with Matlab and NumPy. There is a sparse matrix data structure that you should look at.

Make sure you properly install the native libraries if you want to get the full performance.
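As a rough illustration of the Breeze route, here is a minimal sketch that builds a small sparse matrix and multiplies it by a dense vector. It assumes Breeze is on the classpath and uses Breeze's `CSCMatrix` (compressed sparse column) type; the object name `BreezeSparseSketch` and the matrix values are made up for the example.

```scala
// Assumes the Breeze dependency is available, e.g. in sbt:
//   libraryDependencies += "org.scalanlp" %% "breeze" % "<version>"
import breeze.linalg.{CSCMatrix, DenseVector}

// Hypothetical object name, used only for this illustration.
object BreezeSparseSketch {
  // Build a 3x3 sparse matrix in compressed sparse column (CSC) form.
  // Only the three entries added below are stored explicitly.
  def buildMatrix(): CSCMatrix[Double] = {
    val builder = new CSCMatrix.Builder[Double](rows = 3, cols = 3)
    builder.add(0, 0, 2.0)
    builder.add(1, 2, -1.0)
    builder.add(2, 1, 4.0)
    builder.result()
  }

  def main(args: Array[String]): Unit = {
    val m = buildMatrix()
    val v = DenseVector(1.0, 2.0, 3.0)
    // Sparse matrix times dense vector yields a dense vector.
    val product = m * v
    println(s"stored entries: ${m.activeSize}")
    println(s"m * v = $product")
  }
}
```

This only covers basic construction and multiplication. Whether Breeze offers an equivalent for a particular `scipy.sparse` routine has to be checked case by case.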

Community
Raphael Roth
0

I don't think you can use SciPy and still take advantage of Spark's distributed computation, because SciPy is not designed for Spark.

If you only want to use it on a local machine from a Scala script, you can try Java-Python integration. Alternatively, you can use a SciPy-like library for Java instead.
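One simple form of local integration is to run a Python interpreter as a child process and read its output. The sketch below does that with Scala's standard `scala.sys.process` package. It assumes `python3` is on the `PATH` and has SciPy installed; the object name `SciPyViaProcess`, the `runPython` helper, and the embedded script are hypothetical and chosen only for illustration. This is just one possible approach, not necessarily the integration the answer had in mind.

```scala
import scala.sys.process._

// Hypothetical object name, used only for this illustration.
object SciPyViaProcess {
  // Python source executed in a child process.
  // It requires a python3 interpreter with SciPy installed.
  val script: String =
    """import scipy.sparse as sp
      |m = sp.csr_matrix([[2.0, 0.0], [0.0, 3.0]])
      |print(m.nnz)
      |""".stripMargin

  // Runs the given Python source and returns its trimmed standard output.
  // Throws a RuntimeException if the interpreter exits with a non-zero code.
  def runPython(source: String, interpreter: String = "python3"): String =
    Seq(interpreter, "-c", source).!!.trim

  def main(args: Array[String]): Unit =
    println(s"non-zero entries reported by SciPy: ${runPython(script)}")
}
```

Results come back as text, so anything larger than a few numbers would need a proper exchange format such as a file or a serialized matrix, and every call pays the cost of starting a new interpreter.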

Community