4

I am looking for numeric computation tooling on the JVM. My major requirements are expressiveness/readability, ease of use, evaluation and features in terms of mathematical functions. I guess I am after something like the Matlab kernel (probably including some basic libraries and w/o graphics) on the JVM. I'd like to be able to "throw" computational code at a running JVM and have this code evaluated. I don't want to worry about types. Arbitrary precision and performance are not so important.

I guess there are some nice libraries out there but I think an appropriate language on top is needed to get the expressiveness.

Which tooling would you guys suggest to address expressive, feature-rich numeric computation on the JVM?
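To make the "throw code at a running JVM" requirement concrete, here is a minimal sketch using the JDK's own JShell API (`jdk.jshell`, available since Java 9). It is only an illustration of the idea, not one of the tools suggested in the answers. The class name `JvmEval` and the helper `evalAll` are hypothetical, and the `"local"` execution engine is chosen on the assumption that snippets should run inside the current JVM rather than in a separate process.

```java
import jdk.jshell.JShell;
import jdk.jshell.Snippet;
import jdk.jshell.SnippetEvent;

public class JvmEval {
    /**
     * Evaluates the snippets in order inside one JShell session and
     * returns the string value of the last snippet that produced one.
     * (Hypothetical helper, written for this illustration.)
     */
    public static String evalAll(String... snippets) {
        // "local" asks JShell to execute snippets in the current JVM.
        try (JShell shell = JShell.builder().executionEngine("local").build()) {
            String last = null;
            for (String snippet : snippets) {
                for (SnippetEvent event : shell.eval(snippet)) {
                    if (event.exception() != null) {
                        throw new IllegalStateException("Snippet failed: " + snippet, event.exception());
                    }
                    if (event.status() == Snippet.Status.REJECTED) {
                        throw new IllegalArgumentException("Snippet rejected: " + snippet);
                    }
                    if (event.value() != null) {
                        last = event.value();
                    }
                }
            }
            return last;
        }
    }

    public static void main(String[] args) {
        // State defined by one snippet is visible to the next one.
        System.out.println(evalAll("double x = 3.0;", "Math.sqrt(x * x + 16.0)"));
    }
}
```

This still requires Java syntax and explicit types, which is exactly the gap that a dynamic language on top of the JVM would be expected to close.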

Andreas Steffan
  • I never had the chance to perform a comprehensive comparison of all pure Java matrix libraries, but I quickly got useful results with the `SimpleMatrix` API of [EJML](http://code.google.com/p/efficient-java-matrix-library/). – bluenote10 Mar 14 '13 at 17:14

6 Answers

3

Most of MathWorks Matlab is built on the Intel Math Kernel Library (MKL), which is (IMHO) the unbeatable champion in linear algebra computations. There is Java support, but it costs 500 dollars (the MKL, not just the Java support)...

The second-best option if you want to use Java is jblas, which uses BLAS and LAPACK, the industry standards for linear algebra.

Pure Java libraries apparently perform much worse, see here...

reverse_engineer
  • I would be a bit more careful with recommending jblas over pure java libraries. There are a lot of things to take into account (specific problems, size of matrices, parallel vs single core algorithms). [This benchmark](http://code.google.com/p/java-matrix-benchmark/) might help with the decision. Also, note that the OP does not focus on performance and in terms of ease-of-use there might be better options. – bluenote10 Mar 14 '13 at 17:13
  • @bluenote10 Yeah, I agree that he is not looking for performance, but I always think about it for some reason :)... But on the performance of pure Java I can tell you that in practice hand-optimized native code is unbeatable (except if what [ATLAS](http://math-atlas.sourceforge.net/) is saying is true :)). Then comes automatically-optimized code, like ATLAS, which jblas uses, then pure Java. Just check the large matrix performance of jblas (what it's designed for) in your benchmark. Through the roof... – reverse_engineer Mar 14 '13 at 17:59
  • Yes, native code is faster in general but it is important to remember that it is not unbeatable. Say the problem is to add two medium size matrices on a single core hardware. In this case EJML beats jblas by almost a factor of 10. Since we cannot assume the OP exclusively has to deal with huge matrices, I would be more careful with such a generalization. – bluenote10 Mar 15 '13 at 08:19
  • @bluenote10 Agreed, but that's because jblas is made for large matrices (since there's overhead in preprocessing to optimize). But do you think that an implementation in C++ of the EJML algorithm that does this addition would be faster or slower? I ask because I really don't know. A lot of Java people say the HotSpot VM can optimize so well that you can see performance gains in Java vs native, but in practice, I never really saw that... – reverse_engineer Mar 15 '13 at 08:40
  • My general bet would also be on C++. But again, there might be specific problems/cases where a Java implementation could be on par or faster. – bluenote10 Mar 15 '13 at 09:02
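For readers following the matrix-addition example in the comments above, this is roughly what an element-wise add looks like in pure Java. It is a minimal sketch, not EJML's or jblas's actual code; the class name `DenseAdd` and the flat row-major storage are assumptions made for illustration. The point is only that the whole operation is a single loop over primitive arrays with no native call involved, which may explain why a pure Java library can be competitive at small and medium sizes.

```java
public class DenseAdd {
    /**
     * Adds two matrices of identical shape, each stored as a flat
     * row-major array, and returns the result as a new array.
     * (Hypothetical helper, written for this illustration.)
     */
    public static double[] add(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[] c = new double[a.length];
        // One pass over contiguous memory; the shape does not matter for addition.
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] + b[i];
        }
        return c;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4};     // 2x2 matrix, row-major
        double[] b = {10, 20, 30, 40}; // 2x2 matrix, row-major
        System.out.println(java.util.Arrays.toString(add(a, b)));
    }
}
```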
3

From the jGroovyLab page:

The GroovyLab environment aims to provide a Matlab/Scilab-like scientific computing platform that is supported by a scripting engine implemented in the Groovy language. The GroovyLab user can work either with a Matlab-like command console, or with a flexible editor based on the jsyntaxpane (http://code.google.com/p/jsyntaxpane/) component, that offers more convenient code development. Also, GroovyLab supports Computer Algebra based on the symja (http://code.google.com/p/symja/) project.

And there is also GroovyLab:

GroovyLab is a collection of Groovy classes to provide matlab-like syntax and basic features (linear algebra, 2D/3D plots). It is based on jmathplot and jmatharray libs:

Groovy has a smooth learning curve for Java programmers and a flexible syntax similar to Ruby. It is also pretty easy to write a DSL on it.

Though Groovy's performance is pretty good for a dynamic language, you can use static compilation if you need more.

Will
2

Spire sounds like it's aiming at the area you're looking at. It takes advantage of a lot of recent Scala features such as macros to get decent performance without having to sacrifice the expressiveness of being in a high-level language.

There's also Breeze, which is targeted at machine learning but includes a fair amount of linear algebra stuff.

Impredicative
2

Depending how much work you want to get into and what languages you're already familiar with, Incanter in the Clojure world might be worth a look. Also quickly evolving in Clojure right now is core.matrix, which aims to encapsulate high-level common abstractions in linear algebra implemented with various methods or packages.
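The idea of one high-level abstraction backed by interchangeable implementations can be sketched as follows. This is written in Java purely for illustration and is not core.matrix's actual API (which is Clojure); the names `MatrixBackends`, `MatrixOps`, `PureJavaOps` and `square` are all hypothetical.

```java
public class MatrixBackends {
    /** Hypothetical abstraction: callers depend on this, not on a concrete library. */
    public interface MatrixOps {
        double[][] multiply(double[][] a, double[][] b);
    }

    /** One possible backend: a plain triple loop in pure Java. */
    public static final class PureJavaOps implements MatrixOps {
        @Override
        public double[][] multiply(double[][] a, double[][] b) {
            int rows = a.length, inner = b.length, cols = b[0].length;
            if (a[0].length != inner) {
                throw new IllegalArgumentException("dimension mismatch");
            }
            double[][] c = new double[rows][cols];
            for (int i = 0; i < rows; i++) {
                for (int p = 0; p < inner; p++) {
                    for (int j = 0; j < cols; j++) {
                        c[i][j] += a[i][p] * b[p][j];
                    }
                }
            }
            return c;
        }
    }

    /** Library-agnostic user code: works with any backend passed in. */
    public static double[][] square(MatrixOps ops, double[][] x) {
        return ops.multiply(x, x);
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}};
        System.out.println(java.util.Arrays.deepToString(square(new PureJavaOps(), x)));
    }
}
```

A second backend (for example one delegating to a native BLAS wrapper) could implement the same interface without any change to `square`, which is the kind of decoupling described above.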

You highlighted expressiveness in your post, and the nice thing about Clojure is that, as a Lisp, it is possible to make or extend DSLs to closely match problem domains. This is one of the big draws of the language (and of Lisps in general).

JohnJ
2

I'm the original author of core.matrix for Clojure. So I have a clear affinity and much more knowledge in this specific space. That said, I'm still going to try and give you an honest answer :-)

I was in the same position as you a year or so back, looking for a solution for numeric computation that would be scalable, flexible and suitable for deployment as a clustered cloud service.

I ended up going with Clojure for the following reasons:

  • Functional Programming: Clojure is a functional programming language at heart, more so than most other languages (although not as much as Haskell....). Lazy infinite sequences, persistent data structures, immutability throughout etc. Makes for elegant code when you are dealing with big computations.
  • Metaprogramming: I saw a need to do code generation for vector / computational expressions. Hence being a Lisp was a big plus: once you have done code generation in a homoiconic language with a "whole language" macro system then it's hard to find anything else that comes close.
  • Concurrency: Clojure has an impressive and novel approach to multi-core concurrency. If you haven't seen it then watch: http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey
  • Interactive REPL: Something I've always felt is very important for data work. You want to be able to work with your code / data "live" to get a real feel for its properties. Having a dynamically typed language with an interactive REPL works wonders here.
  • JVM based: big advantage for pragmatic purposes, because of the huge library / tool ecosystem and the excellent engineering in the JVM as a runtime platform.
  • Community: I saw a lot of innovation going on in Clojure, particularly around the general area of data and analytics.

The main thing Clojure was lacking at that time was a good library / API for matrix operations. There were some nice tools in Incanter, but they weren't very general purpose or performant. Hence I started developing core.matrix, which is shaping up to be an idiomatic Clojure-flavoured equivalent of NumPy / SciPy. Right now it is still a work in progress but good enough for production use if you are careful.

In terms of low-level matrix support, I also maintain vectorz-clj, which is my attempt to provide a core.matrix implementation that offers high performance vector/matrix operations while remaining pure Java (i.e. no native dependencies). If you are interested in the performance of this, you may like to see:

My second choice after Clojure would have been Scala. I liked Scala's slightly greater maturity and decent static type system. Both the languages are JVM based so the library / tool side was a tie. It was probably the Lisp features that clinched it.

mikera
  • I have been looking at Clojure and I am aware of its strengths, especially its metaprogramming capabilities. I had the impression that by picking Clojure, I would first have to suffer through writing a bunch of macros in order to get the expressiveness that enables mortal people to understand the computation. I may be wrong and the macros may already exist somewhere. – Andreas Steffan Mar 15 '13 at 16:57
0

If you happen to have access to Mathematica, then it's fairly easy to get it working with the JVM by means of J/Link. For Clojure, Clojuratica is an excellent library to make that as seamless as possible, although it has not been maintained for a while and it may take some effort to get it working in modern environments again.

Daniel Janus