28

After using it for a while, I really like the NumPy multi-dimensional array. It helps me write algorithms in concise yet readable and fairly general code. I would like to have the same thing in Java. Before I code a multi-dimensional array with a NumPy-like API myself, does such a thing already exist?

[PS] I searched a bit but did not see one.

Monkey
  • 1
    If you are referring to the Java *platform*, [Scalala](https://github.com/scalala/Scalala) looks like a good candidate... – Samuel Audet Mar 11 '12 at 08:06
  • 1
    You don't need Scala. Those multi-dimensional data structures you refer to are matrices. You need a linear algebra library, like LA4J. – duffymo Jul 18 '21 at 18:52

10 Answers

20

The OP is from 2011. As of the end of 2015, I would like to mention that there is a new kid in town which claims to be NumPy for Java: nd4j. The nice thing is that nd4j is an abstraction layer on top of different libraries such as BLAS. Depending on the size of your matrices, there are underlying implementations twice as fast as NumPy or jblas. And your code is truly platform independent.

KIC
  • 8
    Unfortunately, even 4 years later, ND4j is quite poorly documented. – user118967 Feb 10 '20 at 04:12
  • 1
    Even an operation as basic as broadcasting is not clearly documented. See for example https://stackoverflow.com/q/60143939 and https://stackoverflow.com/q/42309075 – user118967 Feb 10 '20 at 04:19
  • 4
    Hi, would you mind filing an issue on this where we can see it https://github.com/eclipse/deeplearning4j/issues ? That would be a much more productive discussion. We are happy to address individual needs and know we can always improve (just like any OSS project really) – Adam Gibson Feb 10 '20 at 05:45
  • 1
    nd4j link is broken as of today. – theprogrammer Mar 29 '21 at 02:21
  • 2
    @theprogrammer https://deeplearning4j.konduit.ai/nd4j/basics – rocksNwaves May 08 '21 at 16:39
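Since the comments above raise broadcasting, here is a minimal sketch of the NumPy broadcasting rule in plain Java: shapes are aligned from the trailing dimension, and two sizes are compatible when they are equal or one of them is 1. The class and method names (`BroadcastSketch`, `broadcastShape`, `addRow`) are hypothetical and are not part of ND4J or any other library mentioned here.

```java
import java.util.Arrays;

/** Hypothetical sketch of NumPy-style broadcasting; not ND4J code. */
final class BroadcastSketch {

    /** Returns the broadcast shape of two shapes, aligning them from the trailing dimension. */
    static int[] broadcastShape(int[] a, int[] b) {
        int rank = Math.max(a.length, b.length);
        int[] out = new int[rank];
        for (int i = 0; i < rank; i++) {
            // Missing leading dimensions are treated as size 1.
            int da = i < rank - a.length ? 1 : a[i - (rank - a.length)];
            int db = i < rank - b.length ? 1 : b[i - (rank - b.length)];
            if (da != db && da != 1 && db != 1) {
                throw new IllegalArgumentException("Shapes " + Arrays.toString(a)
                        + " and " + Arrays.toString(b) + " do not broadcast");
            }
            // A size-1 dimension stretches to match the other operand.
            out[i] = (da == 1) ? db : da;
        }
        return out;
    }

    /** Adds a row vector to every row of a matrix: shape (m, n) + shape (n) gives (m, n). */
    static double[][] addRow(double[][] matrix, double[] row) {
        double[][] result = new double[matrix.length][];
        for (int i = 0; i < matrix.length; i++) {
            if (matrix[i].length != row.length) {
                throw new IllegalArgumentException("Row length does not match matrix width");
            }
            result[i] = new double[row.length];
            for (int j = 0; j < row.length; j++) {
                result[i][j] = matrix[i][j] + row[j];
            }
        }
        return result;
    }
}
```

For example, `broadcastShape(new int[]{3, 1}, new int[]{4})` yields the shape `(3, 4)`, while shapes `(2, 3)` and `(2)` are rejected because the trailing sizes 3 and 2 differ and neither is 1.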
12

The library Vectorz (https://github.com/mikera/vectorz) offers a fully featured NDArray that is broadly equivalent in functionality to NumPy's NDArray, i.e. it offers the following features:

  • Arbitrary N-dimensional arrays of numeric values (in this case, Java doubles)
  • Lightweight views using strided access for efficient slicing
  • A broad range of mathematical operations with efficient implementations
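
To illustrate what "lightweight views using strided access" means in general, here is a minimal plain-Java sketch. It is not the Vectorz implementation: the class `StridedSketch` and its methods are hypothetical names chosen for this example. The idea is that an array is a flat `double[]` plus an offset, a shape and per-dimension strides, so a slice or a transpose is just a new combination of those over the same storage.

```java
/** Hypothetical sketch of a strided N-dimensional array; not the Vectorz NDArray class. */
final class StridedSketch {
    private final double[] data;   // backing storage, shared between an array and its views
    private final int offset;      // position of element (0, ..., 0) inside data
    private final int[] shape;     // size of each dimension
    private final int[] strides;   // step inside data for each dimension

    private StridedSketch(double[] data, int offset, int[] shape, int[] strides) {
        this.data = data;
        this.offset = offset;
        this.shape = shape;
        this.strides = strides;
    }

    /** Creates a zero-filled array with row-major strides. */
    static StridedSketch zeros(int... shape) {
        int[] strides = new int[shape.length];
        int size = 1;
        for (int d = shape.length - 1; d >= 0; d--) {
            strides[d] = size;
            size *= shape[d];
        }
        return new StridedSketch(new double[size], 0, shape.clone(), strides);
    }

    double get(int... index) {
        return data[position(index)];
    }

    void set(double value, int... index) {
        data[position(index)] = value;
    }

    /** Returns a view that fixes one index along dimension dim; no data is copied. */
    StridedSketch slice(int dim, int index) {
        if (dim < 0 || dim >= shape.length || index < 0 || index >= shape[dim]) {
            throw new IndexOutOfBoundsException("Invalid slice");
        }
        int[] newShape = new int[shape.length - 1];
        int[] newStrides = new int[shape.length - 1];
        for (int d = 0, k = 0; d < shape.length; d++) {
            if (d != dim) {
                newShape[k] = shape[d];
                newStrides[k] = strides[d];
                k++;
            }
        }
        return new StridedSketch(data, offset + index * strides[dim], newShape, newStrides);
    }

    /** Returns a transposed view by reversing shape and strides; no data is copied. */
    StridedSketch transpose() {
        int n = shape.length;
        int[] newShape = new int[n];
        int[] newStrides = new int[n];
        for (int d = 0; d < n; d++) {
            newShape[d] = shape[n - 1 - d];
            newStrides[d] = strides[n - 1 - d];
        }
        return new StridedSketch(data, offset, newShape, newStrides);
    }

    /** Maps a multi-dimensional index to a position in the flat backing array. */
    private int position(int[] index) {
        if (index.length != shape.length) {
            throw new IllegalArgumentException("Expected " + shape.length + " indices");
        }
        int pos = offset;
        for (int d = 0; d < index.length; d++) {
            if (index[d] < 0 || index[d] >= shape[d]) {
                throw new IndexOutOfBoundsException("Index out of range in dimension " + d);
            }
            pos += index[d] * strides[d];
        }
        return pos;
    }
}
```

Because a view shares the backing array, writing through `a.slice(1, 2)` changes column 2 of `a` itself, which is the same behaviour NumPy slices have.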

It's also very fast: it is much faster than NumPy for most operations, although NumPy may still be faster for certain large matrix operations because it uses native BLAS libraries to accelerate them.

Here's the NDArray class itself:

https://github.com/mikera/vectorz/blob/develop/src/main/java/mikera/arrayz/NDArray.java

Disclaimer: I'm the author of Vectorz

mikera
  • NumPy can be compiled with native BLAS support, e.g. OpenBLAS or ATLAS. Does Vectorz take advantage of an optimized linear algebra library? – mariolpantunes Oct 25 '14 at 09:50
  • Vectorz does not support native libraries - it's deliberately designed to be pure JVM code. There are other Java libraries (e.g. MTJ, JBlas) that can use BLAS. – mikera Oct 28 '14 at 04:37
  • Hi Mike. I find Vectorz very attractive. I am working on color format transformation from RGB to HSL in real time and think that Vectorz is exactly what I need. Is it possible to apply a mathematical expression to every element of a matrix in an efficient way? Assume that the matrix represents a single camera frame, let's say a 2D array. – Yuriy Chernyshov Nov 18 '14 at 19:58
  • @Yuriy there is an `applyOp` function that lets you apply arbitrary (including custom) operations to entire arrays. Worth looking into. – mikera Nov 24 '14 at 04:51
  • Why is Vectorz deliberately designed to not use native and/or GPU code? – Aleksandr Dubinsky Jul 13 '15 at 01:26
  • Is there any alternative to NumPy's `hstack` function in Vectorz? I am looking for such an implementation. – Ravikant Tiwari Apr 19 '17 at 06:21
  • @AleksandrDubinsky there's nothing to stop Vectorz using the GPU. In fact I have an experimental OpenCL implementation here: https://github.com/mikera/vectorz-opencl – mikera Apr 19 '17 at 07:12
  • 1
    @RavikantTiwari you can use a.join(b,1) to join along columns (with 1 being the column dimension, 0 for rows) – mikera Apr 19 '17 at 07:15
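
For readers who want to see what the two operations discussed in these comments do, here is a plain-Java sketch on `double[][]`: applying a function to every element, and joining two matrices along the column dimension (what NumPy calls `hstack`). It deliberately does not use Vectorz; `ArrayOpsSketch`, `map` and `joinColumns` are hypothetical names, and the library's own `applyOp` and `join` should be checked against its documentation.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical plain-Java helpers; these are not Vectorz methods. */
final class ArrayOpsSketch {

    /** Returns a new matrix with op applied to every element of a. */
    static double[][] map(double[][] a, DoubleUnaryOperator op) {
        double[][] result = new double[a.length][];
        for (int i = 0; i < a.length; i++) {
            result[i] = new double[a[i].length];
            for (int j = 0; j < a[i].length; j++) {
                result[i][j] = op.applyAsDouble(a[i][j]);
            }
        }
        return result;
    }

    /** Joins a and b along the column dimension, like NumPy's hstack for 2-D arrays. */
    static double[][] joinColumns(double[][] a, double[][] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Both matrices need the same number of rows");
        }
        double[][] result = new double[a.length][];
        for (int i = 0; i < a.length; i++) {
            result[i] = new double[a[i].length + b[i].length];
            // Copy the row of a first, then the row of b right after it.
            System.arraycopy(a[i], 0, result[i], 0, a[i].length);
            System.arraycopy(b[i], 0, result[i], a[i].length, b[i].length);
        }
        return result;
    }
}
```

A call such as `ArrayOpsSketch.map(frame, Math::sqrt)` applies the square root to each element, and `joinColumns` of a 2x1 and a 2x2 matrix gives a 2x3 matrix.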
4

You can use numerical libraries for linear algebra; those will have matrices in them. Have a look at Apache Commons Math.

duffymo
  • 3
    I'm aware of that library. Its API is a classic 'vector & matrix' one. I'm looking for one with a NumPy-like API, which provides multi-dimensional arrays and unifies vectors and matrices as one entity. In my experience, such an approach makes a lot of math code easier to write. – Monkey Dec 05 '11 at 00:38
  • Vectorz (see my answer) provides arbitrary multi-dimensional arrays with a single interface abstraction (INDArray) that is implemented by both vectors and matrices – mikera Dec 22 '13 at 14:30
2

Scala has a wider range of NumPy-like libraries, if that counts. (You should even be able to use them from Java.)

BIDMat promises to be both powerful and fast (and GPU-powered).

As already mentioned, there is also Breeze

Aleksandr Dubinsky
2

This is an old question, but I just thought I'd add these two Java ndarray libraries:

qwerty
  • 2
    Just noting that the TF Java NdArray library is maintained by the TensorFlow org but can be added to any Java project and does not depend on TensorFlow itself (the equivalent of an `NdArray` backed by TensorFlow is a `Tensor` in TensorFlow Java) – Karl Lessard Jul 28 '21 at 00:26
2

So the closest match seems to be Colt: http://acs.lbl.gov/software/colt/

It features a multi-dimensional array object, views over an array and the usual linear algebra. It also seems to be rather efficient.

Monkey
  • 3
    Can you please answer, which class of Colt represents multidimensional array? – Dims Nov 10 '14 at 13:17
  • @Dims the interface for multidimensional arrays is AbstractMatrix http://dst.lbl.gov/ACSSoftware/colt/api/cern/colt/matrix/impl/AbstractMatrix.html. Implementations provided in Colt cover only 1, 2 and 3 dimensional cases. – dlegland Nov 27 '17 at 13:04
  • 3
    @dlegland so it is far from being equivalent to `numpy`. – Dims Dec 30 '18 at 23:03
1

Another great option is to use Spark’s DataFrame API.

http://spark.apache.org/docs/latest/sql-programming-guide.html

This gives you a Pandas/NumPy-like interface to arrays in Java. In addition, the code is inherently parallelizable and can be run on a cluster of machines if your data size increases.

Asim Jalis
1

Java is rather clumsy for nd-arrays (no operator overloading, etc). If Kotlin is OK, you can try Kotlin-NumPy (https://github.com/Kotlin/kotlin-numpy)

nd4j (https://github.com/deeplearning4j/nd4j) was quite popular some time ago, but it now seems to be unmaintained.

iirekm
1

I'd recommend la4j, an elegant, modern linear algebra library, or JBLAS, another that ports BLAS to Java.

duffymo
-4

I would say that Java has nothing 'like' NumPy. NumPy is a large, mathematically oriented project that does not really fit the Java mentality.

That does not mean that there are no good collection libraries in Java! Guava has the Table interface with two good implementations, ArrayTable and HashBasedTable. It's more a collection library than a mathematical tool, but it's very useful.

For speed and memory efficiency, there is Trove, a collection library that works with primitives.

For matrix operations, JAMA seems good.

As far as I know, you will need to write more code and use more libraries in Java than in Python.

user983716