I currently have this code:

AtomicInteger counter = new AtomicInteger(0);
return IntStream.range(0, costs.length)
  .mapToObj(i -> new int[]{costs[i][0]-costs[i][1], i})
  .sorted(Comparator.comparingInt(d -> d[0]))
  .mapToInt(s -> 
    counter.getAndIncrement() < costs.length/2 ? costs[s[1]][0] : costs[s[1]][1]
  )
  .sum();

Here I compute the difference between the two elements of each row, sort by that difference, and finally need to process the two halves of the sorted result separately.

Is there a better way to do this than using an AtomicInteger as a counter? Is there a method like mapToIntWithIndex available in the JDK (not in external libraries)? Is there something like zip() in Python that would let me pair indices with the stream elements? If not, is there any plan to add this in future Java releases?

Bojan Vukasovic
  • If you just want a `zip`, check out [this post](https://stackoverflow.com/questions/17640754/zipping-streams-using-jdk8-with-lambda-java-util-stream-streams-zip). – Sweeper Jun 03 '20 at 09:37
  • This is not a reliable way to do this. You are relying on elements of the stream being processed in order, and that's not guaranteed. – Andy Turner Jun 03 '20 at 09:40
  • @AndyTurner I thought it is, unless I start using parallel processing? – Bojan Vukasovic Jun 03 '20 at 09:42
  • No, there is no such guarantee, whether you use sequential or parallel. It’s just an implementation detail when the processing happens in a particular order. A different implementation could determine that `sum()` is a terminal operation that does not depend on the order, hence, the `sorted` step is obsolete and can be elided. – Holger Jun 03 '20 at 10:46
  • @Holger - so you say that we should never use `sorted()` in Java streams (to mitigate potential future implementation differences)? – Bojan Vukasovic Jun 03 '20 at 10:53
  • No, I’m saying that you should only use `sorted` in a Java stream when the subsequent operations are specified to respect the resulting order (which implies being dependent on the order at all). There is nothing wrong with, e.g. `sorted().toArray()` or `sorted().collect(toList())` or `sorted().collect(toCollection(LinkedHashSet::new))`. Likewise, inserting `map` operations between `sorted` and the terminal operation isn’t wrong, as long as the mapping function does not assume to be evaluated in that order. `sum()` obviously is not an operation depending on the order. – Holger Jun 03 '20 at 10:59
  • @Holger it's just `int[][] costs` – Bojan Vukasovic Jun 03 '20 at 11:01
  • `int[][] sorted = Arrays.stream(costs).sorted(Comparator.comparingInt(a -> a[0] - a[1])).toArray(int[][]::new); return IntStream.range(0, sorted.length).map(ix -> sorted[ix][ix < sorted.length/2 ? 0 : 1]).sum();` – Holger Jun 03 '20 at 11:04

1 Answer

This is not a reliable way to do this. The Streams API documentation makes it clear that functions passed to operations such as `map` should not be stateful:

> Stream pipeline results may be nondeterministic or incorrect if the behavioral parameters to the stream operations are stateful.

If you use stateful functions, it may appear to work, but because you aren't using it according to the documentation, the behaviour is technically undefined, and could break in future versions of Java.

Collect to a list, and then process the two halves of the list:

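// assumes: import static java.util.stream.Collectors.toList;
List<int[]> list = /* your stream up to and including the sort */.collect(toList());
int half = list.size() / 2;  // same split point as costs.length/2 in the question
int sum = list.subList(0,    half       ).stream().mapToInt(s -> costs[s[1]][0]).sum()
        + list.subList(half, list.size()).stream().mapToInt(s -> costs[s[1]][1]).sum();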
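
For reference, here is a minimal self-contained sketch of the collect-then-split approach, with the question's pipeline filled in for the elided part. The class name `TwoHalvesSum`, the method name `sumByHalves`, and the sample data are assumptions made for illustration; they do not come from the question.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class TwoHalvesSum {

    // Hypothetical helper; assumes every costs[i] holds exactly two ints.
    public static int sumByHalves(int[][] costs) {
        // Pair each difference with its original index, sort by the difference,
        // and materialize the result so the order is fixed before it is used.
        List<int[]> list = IntStream.range(0, costs.length)
                .mapToObj(i -> new int[]{costs[i][0] - costs[i][1], i})
                .sorted(Comparator.comparingInt(d -> d[0]))
                .collect(Collectors.toList());

        int half = list.size() / 2;

        // First half contributes column 0, second half contributes column 1.
        // No shared mutable counter is needed, so the lambdas stay stateless.
        return list.subList(0, half).stream().mapToInt(s -> costs[s[1]][0]).sum()
             + list.subList(half, list.size()).stream().mapToInt(s -> costs[s[1]][1]).sum();
    }

    public static void main(String[] args) {
        // Sample data chosen for illustration only.
        int[][] costs = {{10, 20}, {30, 200}, {400, 50}, {30, 20}};
        System.out.println(sumByHalves(costs)); // prints 110
    }
}
```

With the sample input, the sorted differences are -170, -10, 10 and 350, so the first half contributes 30 + 10 and the second half contributes 20 + 50.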

Actually, I'd be tempted to write it as for loops, as I just find it easier on the eye:

int sum = 0;
for (int[] s : list.subList(0, half))           sum += costs[s[1]][0];
for (int[] s : list.subList(half, list.size())) sum += costs[s[1]][1];
Andy Turner