
I know that the limit method can restrict a Java Stream's length to a specified number of elements.

But is there a way to limit the stream's length by the running total of an int field of the stream's items?

The stream is made from a List.

The requirements are:

  • The approach needs to be able to execute the processing in parallel.
  • The approach needs to produce a Stream object.
  • I prefer Java 8 if possible.

Currently I implement it the following way:

  1. Use a for loop and accumulate the total.
  2. Break out of the loop when the total reaches the limit value.

But this approach cannot satisfy the requirements.

If a limitByTotalIntOf method existed, the ideal usage would be:

class Item { int value; }
List<Item> listOfItems = createListOfItems();

int limitValue = 10000;
listOfItems.parallelStream()
    .map(item -> /* set value */)
    .limitByTotalIntOf(limitValue, item -> item.value)
    .map(item -> /* do something */);

The current way is:

int total = 0;
for(Item item: listOfItems) {
    int v = /* set value */;
    total += v;
    if(total > limitValue) break;
    /* do something */
}

Is there any way to satisfy the requirements?

Naman
ssskkk

1 Answer


Assuming that your /* do something */ is an expensive operation that truly benefits from parallel processing, the best solution is to separate the selection of the affected items from their actual processing.

The selection is best done sequentially, similar to the loop you already have:

List<Item> selectedItems = listOfItems;
int total = 0;
for(int ix = 0, num = listOfItems.size(); ix < num; ix++) {
    total += listOfItems.get(ix).getValue();
    if(total > limitValue) {
        selectedItems = listOfItems.subList(0, ix);
        break;
    }
}
selectedItems.parallelStream()
// your actual heavy operation
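Put together as a compilable sketch (the `Item` class body and the sample values here are assumptions for illustration):

```java
import java.util.Arrays;
import java.util.List;

class Item {
    final int value;
    Item(int value) { this.value = value; }
    int getValue() { return value; }
}

public class SelectDemo {
    public static void main(String[] args) {
        List<Item> listOfItems = Arrays.asList(
                new Item(3), new Item(4), new Item(5), new Item(6));
        int limitValue = 10;

        // sequential selection, as in the answer above
        List<Item> selectedItems = listOfItems;
        int total = 0;
        for (int ix = 0, num = listOfItems.size(); ix < num; ix++) {
            total += listOfItems.get(ix).getValue();
            if (total > limitValue) {
                selectedItems = listOfItems.subList(0, ix);
                break;
            }
        }
        // running totals are 3, 7, 12 → the limit is exceeded at index 2,
        // so only the first two items are selected
        System.out.println(selectedItems.size()); // prints 2

        // the heavy operation then runs in parallel over the sublist
        int sum = selectedItems.parallelStream().mapToInt(Item::getValue).sum();
        System.out.println(sum); // prints 7
    }
}
```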

Note that, given the current implementation, a parallel stream over a sublist is way more efficient than using limit(…) on a parallel stream over the entire list.


The parallel stream processing works by splitting the entire sequence of elements, right in the middle in the best case, followed by splitting the remaining chunks, until there are enough chunks to keep every CPU core busy. Naturally, this strategy isn’t well suited for selecting elements from the beginning of the sequence to an element, with a condition incorporating all affected elements.
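The halving behavior can be observed directly via `Spliterator.trySplit()` (a small sketch; the exact even split is a property of the array-backed spliterator used here):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class SplitDemo {
    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        Spliterator<Integer> right = list.spliterator();
        // trySplit hands off a prefix to a new Spliterator and keeps the rest
        Spliterator<Integer> left = right.trySplit();
        System.out.println(left.estimateSize() + " + " + right.estimateSize());
        // prints "4 + 4": the sequence was split right in the middle
    }
}
```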

But for completeness: it is possible to do this in parallel, though it’s rather unlikely to yield a benefit. You’d need a really large number of elements, and a significant fraction would have to end up in the selection.

int[] values = listOfItems.parallelStream().mapToInt(Item::getValue).toArray();
Arrays.parallelPrefix(values, Integer::sum);
int index = Arrays.binarySearch(values, limitValue);
index = index < 0 ? -index - 1 : index + 1;
listOfItems.subList(0, index).parallelStream()
// your actual heavy operation
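To make the index arithmetic concrete, here is a minimal runnable version with assumed sample values:

```java
import java.util.Arrays;

public class PrefixDemo {
    public static void main(String[] args) {
        int[] values = {3, 4, 5, 6};
        int limitValue = 10;

        // in-place parallel prefix sums: {3, 7, 12, 18}
        Arrays.parallelPrefix(values, Integer::sum);

        // binarySearch returns -(insertionPoint) - 1 when the key is absent;
        // either way, convert to the count of items whose running total
        // does not exceed the limit
        int index = Arrays.binarySearch(values, limitValue);
        index = index < 0 ? -index - 1 : index + 1;

        System.out.println(index); // prints 2: only the first two items fit
    }
}
```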
Holger
  • It needs heavy calculations to get each value, so I want to parallelize them. – ssskkk Jul 22 '19 at 01:22
  • What matters is `/* set value */`; `/* do something */` can be ignored. – ssskkk Jul 22 '19 at 01:28
  • Moreover, `/* set value */` computes the value along with some other objects, and I want to keep them after counting the values. – ssskkk Jul 22 '19 at 01:33
  • So `/* set value */` is applied to all elements anyway, regardless of the limit applied to subsequent operations? That’s actually easy to achieve, e.g. in the second example, just insert it between `.parallelStream()` and `.mapToInt(Item::getValue)`… – Holger Jul 22 '19 at 09:06