
I'm converting a string array to a HashMap using a stream.

    String[] arguments= new String[]{"-a","1","-b","2","-c","3","-d","4","-e","5"};
    HashMap<String, String> params = new HashMap<>();
    IntStream.iterate(0, i -> i + 2)
            .limit(arguments.length / 2)
            .parallel()
            .forEach(i -> params.put(arguments[i], arguments[i + 1]));
    System.out.println(params.size());

It usually prints 5, but sometimes 4.

Could you please explain why the results differ between runs?

Samuel Liew
Avenir
  • *"I'm converting string array to hashmap **using stream**"* Why? Use a normal `for` loop. It's faster, simpler, clearer, and it works! – Andreas Jul 17 '17 at 20:32
  • Step 1: learning to solve a problem without `forEach`. Step 2: recognizing that a parallel stream doesn’t pay off for such a trivial task (especially not with the parallel-unfriendly `iterate(…).limit(…)` combo). If you want an efficient solution, you may use `HashMap<String,String> params = IntStream.range(0, arguments.length/2).map(i -> i*2).collect(HashMap::new, (m,i) -> m.put(arguments[i], arguments[i+1]), Map::putAll);` that would work in parallel, but there’s still no benefit in parallel processing here. – Holger Jul 18 '17 at 08:24

2 Answers


You are violating a fundamental requirement of streams: no side effects. Your stream has a side effect via forEach. Put simply, you are putting elements from multiple threads into HashMap, a collection that is not thread-safe.

The correct way to do it would be to collect the result:

    String[] arguments = new String[] { "-a", "1", "-b", "2", "-c", "3", "-d", "4", "-e", "5" };
    Map<String, String> map = IntStream.iterate(0, i -> i + 2)
            .limit(arguments.length / 2)
            .parallel()
            .boxed()
            .collect(Collectors.toMap(i -> arguments[i], i -> arguments[i + 1]));

    System.out.println(map); // {-a=1, -b=2, -c=3, -d=4, -e=5}
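
To check the diagnosis that the unsynchronized HashMap is the culprit, here is a minimal sketch that keeps the original forEach but swaps in a thread-safe ConcurrentHashMap. The class name ConcurrentPairsSketch and the helper toMap are hypothetical names chosen for illustration, and IntStream.range(...).map(...) is used in place of iterate(...).limit(...) only because it splits better in parallel. Collecting, as above, remains the cleaner approach; this is just meant to show where the lost entry comes from.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

// Hypothetical class name, for illustration only.
class ConcurrentPairsSketch {

    // Hypothetical helper: pairs arguments[0] with arguments[1],
    // arguments[2] with arguments[3], and so on.
    static Map<String, String> toMap(String[] arguments) {
        // ConcurrentHashMap tolerates concurrent put() calls,
        // unlike HashMap, so the side effect in forEach is safe here.
        Map<String, String> params = new ConcurrentHashMap<>();
        IntStream.range(0, arguments.length / 2)
                .map(i -> i * 2)          // 0, 2, 4, ... (index of each key)
                .parallel()
                .forEach(i -> params.put(arguments[i], arguments[i + 1]));
        return params;
    }

    public static void main(String[] args) {
        String[] arguments = {"-a", "1", "-b", "2", "-c", "3", "-d", "4", "-e", "5"};
        System.out.println(toMap(arguments).size()); // 5
    }
}
```

Note that ConcurrentHashMap rejects null keys and values, so this sketch assumes the array contains no nulls.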
Eugene

When working with parallel streams, the operations must be stateless and must not modify shared state; otherwise, because of differences in thread scheduling, you'll get non-deterministic results.

This line:

.forEach(i -> params.put(arguments[i], arguments[i + 1]));

is the cause of the different results at each execution.

The solution to your problem is to use the collect reduction operation instead of forEach.
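
As a rough sketch of what that could look like, the three-argument collect on IntStream gives each worker thread its own HashMap and merges them at the end, so no map is ever shared between threads. The class name CollectPairsSketch and the helper toMap are hypothetical, and the use of IntStream.range is an assumption borrowed from the comment under the question rather than from the original code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.IntStream;

// Hypothetical class name, for illustration only.
class CollectPairsSketch {

    // Hypothetical helper: builds the map via mutable reduction.
    static Map<String, String> toMap(String[] arguments) {
        return IntStream.range(0, arguments.length / 2)
                .map(i -> i * 2)                       // index of each key
                .parallel()
                .collect(HashMap<String, String>::new, // one container per chunk
                         (m, i) -> m.put(arguments[i], arguments[i + 1]),
                         Map::putAll);                 // merge partial results
    }

    public static void main(String[] args) {
        String[] arguments = {"-a", "1", "-b", "2", "-c", "3", "-d", "4", "-e", "5"};
        System.out.println(toMap(arguments).size()); // 5
    }
}
```

Because every accumulator call writes only to the container handed to it, the result no longer depends on thread scheduling.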

You can find more information about stateless behaviors and side-effects in the Java Stream API documentation.

Ousmane D.