Why is the binarySearch() method in java.util.Arrays implemented using a loop?

Question

Why is the binarySearch() method in java.util.Arrays implemented using a loop and not by using recursion?

the loop doesn't have to deal with such call stack *garbage*. — aran, Jun 12 '19 at 14:50
@khelwood Because recursion is many cases is more readable. If the compiler can optimize recursion that it is as performant as loops (for example can do [tail-call optimization](https://stackoverflow.com/questions/310974/what-is-tail-call-optimization)) then recursion is a preferable way to implement many algorithms. So quick response to question: because Java compiler doesn't do tail-rec optimization. — Krzysztof Atłasik, Jun 12 '19 at 14:51
How is this question too-broad? What happened to this site? It is because questions like this ware answered in great detail by many expert Stack Overflow became popular. Now it looks like it became "debug my code" service and anything outside of it becomes off-topic/too-broad. — Pshemo, Jun 12 '19 at 15:04
@Pshemo I guess the problem might rather be it quickly turns "opinionated". — GhostCat, Jun 12 '19 at 15:11

Krzysztof Atłasik · Accepted Answer · 2019-06-12T17:04:34.050

8

tl;dr: Because javac doesn't do tail call optimization.

As you probably noticed many algorithms are recursive by its nature (binary search is a great example). So why recursion is not widely used in Java?

The first reason is that recursion is not as performant as plain loops. Second, a more important reason is that recursion in Java is not stack-safe. If your recursion will go to deep into the call stack it might cause the StackOverflowError. It is problematic since you can't really predict how deep you will go because it depends on the data you're processing. A consequence of this is that your recursive function may work correctly on smaller sets of data, but will blow up the stack on bigger sets.

Fortunately, every recursive function can be transformed into an iterative function by replacing recursive calls with iterative control constructs and therefore become stack-safe and this is exactly what you see in java.util.Arrays.

There's also the downside that iterative versions of algorithms might be considered to be harder to reason about and less readable (but that's debatable).

Recursion is also a way to do "loops" without mutable state (loop counter) so this way you can do purely functional iterations (for example Haskell don't even have loops, it depends only on recursion).

So can we have stack-safe, readable and performant recursion? Yes, if the compiler can optimize recursion to code very similar to loop. Most of JVM languages compilers (Groovy, Scala, Kotlin, Clojure and probably more) can do tail-call optimalization. That means is recursive call is last thing function does it will be compiled to bytecode, which looks very similar to the plain while loop.

Actually, there is proposal to add tail-call optimization to javac, but it is yet not implemented. If you want to use stack-safe recursion now, you will probably need to use some other JVM language.

edited Jun 12 '19 at 17:04

answered Jun 12 '19 at 15:11

Krzysztof Atłasik

21,985
6
54
76

@Axel `iterative versions of algorithms are usually harder to reason about and less readable` this assertion for me – baudsp Jun 12 '19 at 15:26
Tl Dr should be put at the beginning imo – Yassin Hajaj Jun 12 '19 at 15:30
@baudsp I changed that paragraph. – Krzysztof Atłasik Jun 12 '19 at 15:31
`It is problematic since you can't really predict how deep you will go because it depends on the data you're processing.` Not only that. You also can't predict how deep you could already be in the stack when your recursive function is called in the first level of recursion. – Federico klez Culloca Jun 12 '19 at 15:31
@KrzysztofAtłasik I'm not sure this change is better. I'd agree that for that algorithm a recursive solution would be more readable, but I wouldn't generalize that with the `many times harder` qualifier. – baudsp Jun 12 '19 at 15:34
@baudsp Changed to "might be considered to be harder to reason about and less readable" – Krzysztof Atłasik Jun 12 '19 at 15:42
you didn't have to, we could just have disagreed on that point :). Also a good answer – baudsp Jun 12 '19 at 16:24
1

@baudsp the `binarySearch()` implementation we are talking about is the one in the standard library. That one is not meant for educational purposes, but to be as efficient both memory and performance wise as possible. I am very sure that considering this, the iterative version was chosen for this implementation and this is a 100% plausible and good answer to this question. Whether or not one should chose the iterative over the recursive approach in other circumstances where other priorities might be to consider, is another question. – Axel Jun 12 '19 at 17:47
@Axel I don't see what you you say have anything about what I said. I never said anything about the implementation of `binarySearch()`, I commented on the generalizations made (in the first versions of the answer) on iterative vs recursive. – baudsp Jun 13 '19 at 07:55
2

`binarySearch` from the stdlib was carefully crafted to as performant as possible, but the point is, that in order for it to be so effective it had to be implemented in loops. If recursion was compiling to as performant code as loops in Java (for example by applying optimization like tailrec), then implementers would probably choose a recursive approach (but of course we can't say that for sure). – Krzysztof Atłasik Jun 13 '19 at 08:04

Why is the binarySearch() method in java.util.Arrays implemented using a loop?

1 Answers1