For a college project of mine I needed to implement a deep-learning neural network in plain Java. After profiling the application, I wanted to see whether automatic parallelization using Java's Stream API would lead to a significant performance improvement, but I am struggling to transform my old code into a stream-based approach.

The method takes a vector (a double array), performs a matrix multiplication, adds a value to each element, and finally applies a lambda function (DoubleFunction) to every element.

Here is the old code that I want to replace:

/* e.g.
double[] x = double[100]
int inputNeurons = 100
int outputNeurons = 200
double[][] weights = double[200][100]
double[] biases = double[200]
*/
private double[] output(double[] x) {
    double[] y = new double[outputNeurons];

    for (int i = 0; i < outputNeurons; i++) {
        double preActivation = 0.;
        for (int j = 0; j < inputNeurons; j++) {
            preActivation += weights[i][j] * x[j];
        }
        preActivation += biases[i];
        y[i] = activation.apply(preActivation);
    }
    return y;
}

This is what I came up with so far (it does not work):

private double[] output(double[] x) {
    return Arrays.stream(weights).parallel()
            .map(outputNeuron -> IntStream.range(0, outputNeurons)
                    .mapToDouble(i -> IntStream.range(0, inputNeurons)
                            .mapToDouble(j -> x[i] * outputNeuron[i]).sum()
                ).map(activation::apply)
            ).toArray();
}

Since I don't know streams well enough yet, I would really appreciate any help!

Konstantin

1 Answer

Good attempt, but your stream approach is quite far from the imperative one. The exact equivalent of your imperative approach is:

return IntStream.range(0, outputNeurons)
                //.parallel() uncomment to see difference in performance
                .mapToDouble(i -> IntStream.range(0, inputNeurons)
                        .mapToDouble(j -> weights[i][j] * x[j]).sum() + biases[i])
                .map(activation::apply)
                .toArray();
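To see that the stream version really is equivalent, here is a minimal self-contained sketch that runs both on a tiny layer. The field names mirror the question, but the sizes and values are made up, and `DoubleUnaryOperator` is used for the activation (rather than `DoubleFunction`) because that is what `DoubleStream.map` expects:

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;
import java.util.stream.IntStream;

public class LayerDemo {
    // Made-up layer: 3 inputs, 2 outputs, ReLU as an example activation.
    static final int inputNeurons = 3;
    static final int outputNeurons = 2;
    static final double[][] weights = {{1, 2, 3}, {4, 5, 6}};
    static final double[] biases = {0.5, -0.5};
    static final DoubleUnaryOperator activation = v -> Math.max(0, v);

    // The original imperative version.
    static double[] outputLoop(double[] x) {
        double[] y = new double[outputNeurons];
        for (int i = 0; i < outputNeurons; i++) {
            double preActivation = 0.;
            for (int j = 0; j < inputNeurons; j++) {
                preActivation += weights[i][j] * x[j];
            }
            preActivation += biases[i];
            y[i] = activation.applyAsDouble(preActivation);
        }
        return y;
    }

    // The stream-based equivalent from the answer.
    static double[] outputStream(double[] x) {
        return IntStream.range(0, outputNeurons)
                .mapToDouble(i -> IntStream.range(0, inputNeurons)
                        .mapToDouble(j -> weights[i][j] * x[j]).sum() + biases[i])
                .map(activation::applyAsDouble)
                .toArray();
    }

    public static void main(String[] args) {
        double[] x = {1, 1, 1};
        System.out.println(Arrays.toString(outputLoop(x)));   // [6.5, 14.5]
        System.out.println(Arrays.toString(outputStream(x))); // [6.5, 14.5]
    }
}
```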

Note that there are many factors that influence whether parallel streams will make your code faster or slower than the imperative approach or sequential streams, so you'll want to weigh the following before going parallel:

  • Data size
  • Number of cores
  • Cost per element (time spent executing in parallel plus the overhead of decomposition and merging)
  • Source data structure
  • Packing (primitive types are faster to operate on than boxed values)
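To get a feel for these trade-offs, here is a rough sketch that times the same matrix-vector product sequentially and in parallel. The class name and sizes are made up, and `System.nanoTime()` timings are only indicative -- a harness like JMH is the right tool for real measurements:

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.IntStream;

public class ParallelCheck {
    // Made-up layer sizes; try varying these to see when .parallel() pays off.
    static final int in = 1000, out = 1000;
    static final double[][] w = new double[out][in];
    static final double[] b = new double[out];

    // One forward pass; the only difference is whether the row stream is parallel.
    static double[] forward(double[] x, boolean parallel) {
        IntStream rows = IntStream.range(0, out);
        if (parallel) rows = rows.parallel();
        return rows.mapToDouble(i -> {
            double s = b[i];
            for (int j = 0; j < in; j++) s += w[i][j] * x[j];
            return s;
        }).toArray();
    }

    public static void main(String[] args) {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        for (double[] row : w) for (int j = 0; j < in; j++) row[j] = rnd.nextDouble();
        double[] x = rnd.doubles(in).toArray();

        // Crude timing of 100 passes each way.
        for (boolean parallel : new boolean[]{false, true}) {
            long t0 = System.nanoTime();
            for (int r = 0; r < 100; r++) forward(x, parallel);
            System.out.printf("%s: %.1f ms%n", parallel ? "parallel" : "sequential",
                    (System.nanoTime() - t0) / 1e6);
        }
    }
}
```

Because each row is still summed sequentially, the parallel and sequential versions produce bit-identical results; only the rows are distributed across cores.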

You should also consider reading Should I always use a parallel stream when possible?

Ousmane D.