1

I have the following method that calls itself recursively:

public ArrayList<SpecTreeNode> getLeavesBelow()
   {
      ArrayList<SpecTreeNode> result = new ArrayList<>();
      if (isLeaf())
      {
         result.add(this);
      }

      for (SpecTreeNode stn : chList)
      {
         result.addAll(stn.getLeavesBelow());
      }
      return result;
   }

I'd like to convert the for loop to use parallelStream. I think I'm partly there but not sure how to implement .collect() to 'addAll' to result:

chList.parallelStream()
             .map(SpecTreeNode::getLeavesBelow)
             .collect();

Some assistance would be much appreciated.

Didier L
  • 18,905
  • 10
  • 61
  • 103
Steve W
  • 1,108
  • 3
  • 13
  • 35
  • 1
    You're after recursive streams, aren't you? In that case `flatMap` is probably the way to go. For an example have a look here: http://squirrel.pl/blog/2015/03/04/walking-recursive-data-structures-using-java-8-streams/ – Thomas Jan 31 '18 at 15:57
  • Possible duplicate of [In Java, how do I efficiently and elegantly stream a tree node's descendants?](https://stackoverflow.com/questions/32749148/in-java-how-do-i-efficiently-and-elegantly-stream-a-tree-nodes-descendants) – Didier L Jul 04 '18 at 20:22

2 Answers2

1

Just like this, right? Am I missing something?

result.addAll(
    chList.parallelStream()
         .map(SpecTreeNode::getLeavesBelow)
         .flatMap(Collection::stream)
         .collect(Collectors.toList())
);

Unrelated to your question but because you're seeking performance improvements: you may see some gains by specifying an initial size for your ArrayList to avoid reallocating multiple times.

A LinkedList may be a preferable data structure if you can't anticipate the size, as all you're doing here is continually appending to the end of the list. However, if you need to randomly access elements of this list later then it might not be.

Michael
  • 41,989
  • 11
  • 82
  • 128
  • I don't think you're missing something. I just started with streams today. However, that code gives me compile error: no suitable method found for addAll(List>) method Collection.addAll(Collection extends SpecTreeNode>) is not applicable (argument mismatch; inference variable T has incompatible bounds upper bounds: SpecTreeNode,Object lower bounds: ArrayList)... – Steve W Jan 31 '18 at 16:08
  • @SteveW Ah, you need an additional flat map to go from a `Stream>` to a `Stream`. I've edited my answer. – Michael Jan 31 '18 at 16:14
  • Thanks. That works, buts its way slower than the original for loop :( – Steve W Jan 31 '18 at 16:17
  • @SteveW Yeah, parallel streams are not a silver bullet for improving performance. If you google "parallel stream performance" you'll find plenty of interesting stuff. – Michael Jan 31 '18 at 16:21
  • Yeah, I have a speed critical application, so it was worth a try. I'll leave the question open for a while uncase anybody can suggest something to improve it. If not, I'll mark this as the correct answer. – Steve W Jan 31 '18 at 16:22
  • 2
    Why wrap the stream inside a result.addAll? Aren't you just creating a new list from a already made list? Wouldn't `List result = chList.parallelStream().map(SpecTreeNode::getLeavesBelow).flatMap(Collection::stream).collect(Collectors.toList());` be better? Not sure if it is faster than the original code though. – Asthor Jan 31 '18 at 17:15
  • @Asthor Because of the preceding `result.add(this);` which places the current node at the start of the list. You can work around that by defining a custom collector, though the code is uglier. I've edited my answer to show this. – Michael Jan 31 '18 at 17:46
  • This supplier `()-> result` is wrong, because when the stream is partitioned to be collected in parallel, that supplier (an `ArrayList` containing one element), will be used as a starting point to accumulate elements. This means that the element added at the beginning would be duplicated. It needs to be an empty list and you need to add the element at the beginning later. Besides, using LinkedList is 99.99% of the times a bad idea... – fps Jan 31 '18 at 17:55
  • 3
    A `LinkedList` is very unlikely to help in any way. – Holger Jan 31 '18 at 18:01
  • @FedericoPeraltaSchaffner Hm, you're right. In which case, I can't see a nicer way to do it than by using `addAll`. And sorry but that statement about LinkedList is way off. It's broadly better to prefer ArrayList, sure, but if you're micro-optimising code it's certainly a valid option to try. The only way to *know* is to profile. You can't write it off without even trying it. – Michael Jan 31 '18 at 18:07
  • @FedericoPeraltaSchaffner "*In any situation*". Seriously? [Here you go](https://ideone.com/i42wZS). Paste it into your IDE. LinkedList runs faster. Obviously, I specifically designed it so that it would. – Michael Jan 31 '18 at 18:28
  • You are correct. In "any situation" was a too general and fatalist claim. I've [changed your test](https://ideone.com/pbpSAd) (keeping the same spirit, though) – fps Jan 31 '18 at 20:31
  • @Michael not sure I would have drawn the assumption that the current node has to be at the start of the list. The method in question seems to add only leaf nodes (and only leaf nodes) to the list based on the if statement. Given that it is a leaf node I would assume chList is empty. And by parallelizing the loop the order of result isn't guaranteed anymore. However the question itself seems to lack a few details so one can't be fully sure on what the method actually does (definition and initialization of chList most of all). – Asthor Jan 31 '18 at 20:36
1

I would do it by making the recursive method return a Stream of nodes instead of a List, then filter to keep only the leaves and finally collect to a list:

public List<SpecTreeNode> getLeavesBelow() {
    return nodesBelow(this)
        .parallel()
        .filter(Node::isLeaf)
        .collect(Collectors.toList());
}

private Stream<SpecTreeNode> nodesBelow(SpecTreeNode node) {
    return Stream.concat(
               Stream.of(node), 
               node.chList.stream()
                      .flatMap(this::leavesBelow));
}
fps
  • 33,623
  • 8
  • 55
  • 110