
I generate int numbers with and without IntStream. Surprisingly, on my machine the code runs slower with IntStream. Can someone explain why an IntStream is slower, and why we need it at all if it comes with a performance penalty?

import java.util.Arrays;
import java.util.Random;
import java.util.stream.IntStream;

public class Main {

    private static Random r = new Random();
    private static int sum;
    private static int branchDirection = 500000;
    private static final int LIMIT = 1000000;
    
    public static void main(String[] args) {
//      int[] randomValues= getRandomValuesWithIntStream(); // This is slower
        int[] randomValues = getRandomValuesAsUsual(); // Than this
        
        Arrays.sort(randomValues);
        
        long start = System.nanoTime();
        
        for(int i = 0;i<randomValues.length; i++) {
            if (randomValues[i] >branchDirection) {
                sum += randomValues[i];
            }
        }
        
        System.out.println("Elapsed Time: "+ (System.nanoTime()-start));
    }
    
    private static int[] getRandomValuesAsUsual() {
        int[] randomValues = new int[LIMIT];
        for(int i = 0;i<randomValues.length; i++) {
            randomValues[i] = r.nextInt();
        }
        return randomValues;
    }
    private static int[] getRandomValuesWithIntStream() {
        return IntStream.generate(r::nextInt).limit(LIMIT).toArray();
    }

}
Tristate
  • How much slower, if I may ask? My guess is that with a stream you have the overhead of creating the stream object, looping to fill it, creating the array, and then another loop to fill the array, versus the normal way of just one array creation and one loop. – Mina Aug 12 '22 at 14:17
  • IntStream and its methods are more competitive with big data. My suggestion: if your code or algorithm changes in the future and the data you are using gets bigger, try IntStream and the usual method again to decide which one is more efficient for you. – Tunahan Akdogan Aug 12 '22 at 14:29
  • Why wouldn't it be? Any abstraction has to add at least a little overhead. – Louis Wasserman Aug 12 '22 at 15:09
  • [System#nanoTime is not a great benchmarking tool](https://stackoverflow.com/questions/8853698/is-system-nanotime-system-nanotime-guaranteed-to-be-0/8854104#8854104), and the benchmark you've created suffers from numerous issues that JMH can alleviate for you (number of runs, JVM warmup, etc). I would say the code you've shown here is not a valid benchmark for what you are attempting to show. – Rogue Aug 12 '22 at 15:28
  • [*How do I write a correct micro-benchmark in Java?*](https://stackoverflow.com/q/504103/642706) – Basil Bourque Aug 12 '22 at 19:19

1 Answer

Using streams with small amounts of input may take longer than conventional code. This fact has been covered extensively on Stack Exchange and in the Java community (articles, blogs, etc).

Streams were invented for convenience, for writing code in a functional style. Such code can be more concise and clearer than conventional code, and may be less prone to programmer errors.

For large amounts of input, streams can perform roughly on par with conventional code, because the overhead cost of establishing the stream is amortized over the larger number of items being processed.

If you engage parallel streams with large amounts of input, you may see much better performance than with conventional non-concurrent code, especially on a multi-core machine.
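
As a rough illustration of the three styles discussed above, here is a minimal sketch of the Question's "sum the values above a threshold" step written as a conventional loop, a sequential stream, and a parallel stream. The class and method names (`ThresholdSum`, `sumWithLoop`, and so on) are hypothetical, and the `long` accumulator is my own choice to avoid the `int` overflow the Question's `sum` field would run into. The sketch only shows that the three forms compute the same result; it makes no claim about which is faster on your machine.

```java
import java.util.Arrays;
import java.util.Random;

public class ThresholdSum {

    // Conventional loop: add up every value strictly above the threshold.
    static long sumWithLoop(int[] values, int threshold) {
        long sum = 0;
        for (int v : values) {
            if (v > threshold) {
                sum += v;
            }
        }
        return sum;
    }

    // The same computation as a sequential stream pipeline.
    static long sumWithStream(int[] values, int threshold) {
        return Arrays.stream(values)
                .filter(v -> v > threshold)
                .asLongStream() // widen to long so the sum cannot overflow int
                .sum();
    }

    // The same pipeline, asked to run in parallel.
    static long sumWithParallelStream(int[] values, int threshold) {
        return Arrays.stream(values)
                .parallel()
                .filter(v -> v > threshold)
                .asLongStream()
                .sum();
    }

    public static void main(String[] args) {
        // Fixed seed (an arbitrary choice) so repeated runs see the same data.
        int[] values = new Random(42).ints(1_000_000).toArray();
        int threshold = 500_000;

        System.out.println("loop:     " + sumWithLoop(values, threshold));
        System.out.println("stream:   " + sumWithStream(values, threshold));
        System.out.println("parallel: " + sumWithParallelStream(values, threshold));
    }
}
```

Switching between the sequential and parallel versions is a one-line change, which is part of the convenience argument. Whether the parallel version actually pays off depends on input size and hardware, so it should be measured rather than assumed.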


I would not hesitate to use streams. Any small cost in overhead is likely to be insignificant in most places of most apps. Beware the trap of premature optimization. Programmers are notoriously poor at intuiting bottlenecks.


FYI, the benchmarking code shown in the Question is severely flawed. See the Comment by Rogue above.

Search Stack Overflow to learn about proper benchmarking techniques, such as this Question. And learn to use Java Microbenchmark Harness (JMH). See JEP 230: Microbenchmark Suite.
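
JMH is the right tool for real measurements. Purely to illustrate two of the issues named in Rogue's Comment (JVM warm-up and number of runs), here is a rough plain-Java sketch that times the two array-filling approaches from the Question after some untimed warm-up runs and reports a median over several timed runs. The class name `RoughTiming`, the helper `medianNanos`, and the warm-up and run counts are all hypothetical choices of mine. Treat the numbers it prints as indicative at best; it does not address the other pitfalls that JMH handles.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.Supplier;
import java.util.stream.IntStream;

public class RoughTiming {

    static final int LIMIT = 1_000_000;

    // Conventional approach from the Question: allocate once, fill in a loop.
    static int[] fillWithLoop(Random r) {
        int[] values = new int[LIMIT];
        for (int i = 0; i < values.length; i++) {
            values[i] = r.nextInt();
        }
        return values;
    }

    // Stream approach from the Question.
    static int[] fillWithStream(Random r) {
        return IntStream.generate(r::nextInt).limit(LIMIT).toArray();
    }

    // Runs the task `warmups` times untimed, then `runs` times timed,
    // and returns the median elapsed time in nanoseconds.
    static long medianNanos(Supplier<int[]> task, int warmups, int runs) {
        long sink = 0; // consume results so the work is not trivially discardable
        for (int i = 0; i < warmups; i++) {
            sink += task.get().length;
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            int[] result = task.get();
            samples[i] = System.nanoTime() - start;
            sink += result.length;
        }
        if (sink < 0) {
            System.out.println(sink); // never true here; only keeps `sink` in use
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) {
        Random r = new Random();
        long loopNanos = medianNanos(() -> fillWithLoop(r), 20, 21);
        long streamNanos = medianNanos(() -> fillWithStream(r), 20, 21);
        System.out.println("loop median (ns):   " + loopNanos);
        System.out.println("stream median (ns): " + streamNanos);
    }
}
```

The median is used rather than a single reading so that one slow run (a garbage-collection pause, for example) does not dominate the result.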

Basil Bourque