
The aim of this exercise is to check the presence of a number in an array.

Specifications: The items are integers arranged in ascending order. The array can contain up to 1 million items. The array is never null. Implement the method boolean Answer.Exists(int[] ints, int k) so that it returns true if k belongs to ints, otherwise the method should return false.

Important note: Try to save CPU cycles if possible.

This is my code:

import java.util.*;

class Main {

    static boolean exists(int[] ints, int k) {
        boolean flag = false;
        for (int i = 0; i < ints.length; i++) {
            if (ints[i] == k) {
                flag = true;
                return flag;
            } else {
                continue;
            }
        }
        return flag;
    }

    public static void main(String[] args) {
        int[] ints = {-9, 14, 37, 102};
        System.out.println(Main.exists(ints, 9)); // true
        System.out.println(Main.exists(ints, 102));
    }
}

But after I submitted the code, this was the result:

The solution works with a 'small' array

The solution works with an empty array

The solution works if k is the first element in the array

The solution doesn't work in a reasonable time with one million items

The solution doesn't use the J2SE API to perform the binary search

So why is it not working? Can anyone clarify that?

  • not your problem, but simpler way is `if (ints[i] == k) { return true;}` `else` part not needed. – Scary Wombat Aug 09 '22 at 01:44
  • `System.out.println(Main.exists(ints,9)); // true` - how can this be true? – Scary Wombat Aug 09 '22 at 01:45
  • What on earth is `import static java.sql.DriverManager.println;` for? – Scary Wombat Aug 09 '22 at 01:46
  • `import static java.sql.DriverManager.println;` ??? Use `System.out.println` – Stephen C Aug 09 '22 at 01:46
  • 4
    I mean, it certainly looks like it'd be a much more efficient program if you write the binary search algorithm and used that. – Louis Wasserman Aug 09 '22 at 01:47
  • 4
    _"... The items are integers arranged in ascending order. ..."_ - they give you that information because they expect you to use it. – Dawood ibn Kareem Aug 09 '22 at 01:50
  • 1
    Poor title. Rewrite to summarize your specific technical issue. – Basil Bourque Aug 09 '22 at 01:52
  • I rolled back your edit which turned it into a different question, one which invalidated the "use binary search because linear search is slow" answer. And you removed the results output. It sounds like you now have a debugging question about your attempt at Binary Search. That's a new question which you should ask separately. (You can include a *link* to this on in your new question, but make sure it stands on its own and has a [mcve] including details like what actually happens vs. what should happen. e.g. infinite loop on not-found? Wrong answer?) – Peter Cordes Aug 10 '22 at 03:53

1 Answer


Your code uses an inefficient algorithm.

Modern CPUs and OSes are incredibly complicated and do vast amounts of bizarre optimizations. Thus, 'save some CPU cycles' is not something you can meaningfully reason about anymore. So let's objectify that statement into something that is useful:

The exercise wants you to find the algorithmically least complex algorithm for the task.

"Algorithmic complexity" is best thought of as follows: Define some variables. Let's say: The size of the input array. Let's call it N.

Now pick a bunch of numbers for N, run your algorithm a few times averaging the runtime. Then chart this. On the x-axis is N. On the y-axis is how long it took.

In other words, make some lists of size 10 and run the algorithm a bunch of times. Then go for size 20, size 30, size 40, and so on. A chart rolls out. At the beginning, for low N, it'll be all over the place, wildly different numbers. Your CPU is busy doing other things and who knows what - all sorts of esoteric factors (literally what song is playing on your music player, that kind of irrelevant stuff) control how long things take. But eventually, for a large enough N, you'll see a pattern - the line starts coalescing, the algorithmic complexity takes over.
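For example, here is a deliberately naive sketch of that experiment. The array sizes, the run count, and the class name TimingSketch are just illustrative choices, and it assumes the Main.exists method from the question is compiled alongside it; a real benchmark would use something like JMH to deal with JIT warm-up:

import java.util.Arrays;
import java.util.Random;

class TimingSketch {

    // Average wall-clock time of the question's linear scan for one array size.
    static long averageNanos(int n, int runs) {
        int[] data = new Random(42).ints(n).toArray();
        Arrays.sort(data); // the exercise guarantees ascending order, so sort the test data too
        long total = 0;
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            Main.exists(data, Integer.MAX_VALUE); // k almost certainly absent: worst case for a linear scan
            total += System.nanoTime() - start;
        }
        return total / runs;
    }

    public static void main(String[] args) {
        for (int n : new int[]{10_000, 100_000, 1_000_000}) {
            System.out.println(n + " items: ~" + averageNanos(n, 20) + " ns");
        }
    }
}

Charting the averages this prints against N gives you exactly the kind of curve described next.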

At that point, the line (from there on out - so, looking to the 'right', to larger N) looks like a known graph. It may look like y = x - i.e. a straight line at an angle. Or like y = x^2. Or something complicated like y = x^3 + x!.

That is called the big-O number of your algorithm.

Your algorithm has O(n) performance. In other words, an angled line: if 10,000 items take 5 milliseconds to process, then 20,000 items will take 10, and 40,000 items will take 20.

There is an O(log n) algorithm available instead. In other words, a line that flattens out as N grows: if 10,000 items take 5 milliseconds, 100,000 items take only a little over 6, and a million items roughly 7.5, because each tenfold increase in N adds only a small, constant number of extra steps. Make N large enough and the algorithmically simpler one will trounce the other one, regardless of how many optimizations the algorithmically more complex one has. Because math says it has to be that way.
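To put the exercise's own numbers on that (a back-of-the-envelope step count, not a benchmark; the class name StepCount is just for illustration): a linear scan of one million items may inspect every single element, while a binary search needs at most about ceil(log2(1,000,000)) = 20 comparisons.

// Worst-case element inspections: linear scan vs. binary search.
class StepCount {
    public static void main(String[] args) {
        for (int n : new int[]{10_000, 100_000, 1_000_000}) {
            int binarySteps = 32 - Integer.numberOfLeadingZeros(n); // roughly ceil(log2 n)
            System.out.println(n + " items: linear scan <= " + n
                    + " steps, binary search <= " + binarySteps + " steps");
        }
    }
}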

Hence, no amount of OS, JVM, or hardware optimization can ever make an algorithm of higher complexity (say, O(n)) beat one of lower complexity (say, O(log n)) for a large enough N.

So let's figure out this O(log n) algorithm.

Imagine a phone book. I ask you to look up Mr. Smith.

You could open to page 1 and start reading. That's what your algorithm does. On average you have to read through half of the entire phonebook.

Here's another algorithm: instead, turn to the middle of the book. Check the name there. Is that name 'lower' or 'higher' than Smith? If it's lower, tear the top half off the phone book and toss it in the garbage. If it's higher, tear the bottom half off and get rid of that.

Then repeat: Pick the middle of the new (half-sized) phonebook. Keep tearing out half of that book until only one name remains. That's your match.

This is algorithmically less complicated: with a single lookup you eliminate half of the phone book. Got a million entries in that phone book? One lookup eliminates HALF of all the names it could be.

In 20 lookups you can get an answer even if the phone book has 2^20 ≈ 1 million entries. Got 21 lookups? Then you can deal with 2 million.

The crucial piece of information is that your input is sorted. It's like the phone book! You can apply the same algorithm: pick a start and an end, then look at the middle element. Is it your answer? Great, return true;. Not a match? If the middle value is lower than k, start becomes the middle and you look again; if it's higher, end becomes the middle. Once start and end meet without finding k, return false.
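A minimal Java sketch of that idea, keeping the exercise's exists(int[], int) signature (the class name BinarySearchSketch and the closed-range lo/hi bookkeeping are just illustrative choices; the method drops into the question's Main class as a replacement for the linear version):

class BinarySearchSketch {

    static boolean exists(int[] ints, int k) {
        int lo = 0;                       // lowest index still in play
        int hi = ints.length - 1;         // highest index still in play
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // written this way to avoid the int overflow of (lo + hi) / 2
            if (ints[mid] == k) {
                return true;              // found it
            } else if (ints[mid] < k) {
                lo = mid + 1;             // k, if present, is in the upper half
            } else {
                hi = mid - 1;             // k, if present, is in the lower half
            }
        }
        return false;                     // the range collapsed without finding k
    }

    public static void main(String[] args) {
        int[] ints = {-9, 14, 37, 102};
        System.out.println(exists(ints, 9));   // false: 9 is not in the array
        System.out.println(exists(ints, 102)); // true
    }
}

With one million sorted items that loop runs at most about 20 times, which is exactly the kind of saving the exercise's time limit is testing for.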

That's the algorithm they want you to write. It's called "binary search". Wikipedia has a page on it, lots of web tutorials cover the notion. But it's good to know why binary search is faster.
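And since the grader's feedback also points at the J2SE API: the JDK already ships this exact algorithm as java.util.Arrays.binarySearch, which returns a non-negative index when the key is found and a negative 'insertion point' encoding when it is not. A one-line sketch of the exercise's method on top of it (the class name Answer follows the spec's Answer.Exists wording; the precondition is that the array is sorted ascending, which the exercise guarantees):

import java.util.Arrays;

class Answer {
    static boolean exists(int[] ints, int k) {
        // binarySearch returns the index (>= 0) if k is present,
        // or -(insertionPoint) - 1 (always negative) if it is not.
        return Arrays.binarySearch(ints, k) >= 0;
    }
}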

NB: Moore's Law, the observation that computers get faster over time, more or less caps what hardware progress can buy you at around O(n^2). In other words, any algorithm whose algorithmic complexity is worse than O(n^2) is utterly safe - no amount of technological advancement will ever mean that a computation which cannot reasonably be run today will run in a flash on the hardware of tomorrow. Anything less complicated can just wait for technology to become faster. In that sense, anything more complex than O(n^2) will take literally quadrillions of years for a large enough N, and always will. That's why we say 'this crypto algorithm is safe' - because it's algorithmically significantly more complicated than O(n^2), so the Apple M99 chip released in 2086 still can't crack anything unless you give it quadrillions of years. Perhaps one day quantum tech leapfrogs this entire notion, but that's a bit too far out there for an SO answer :)

rzwitserloot
  • *'save some CPU cycles' is not something you can meaningfully reason about anymore.* Agreed with your point that algorithmic complexity is the key factor here, but let's not overstate things. If you're one of the human experts that knows about modern CPUs, e.g. in order to write optimizing compilers and JVM JITs to make code for them, you certainly can save CPU cycles in some code by writing it to compile efficiently. There's probably nothing you could usefully micro-optimize about a linear search in Java, though; it's up to the JVM whether it uses SIMD to search 16 or 32 bytes at once. – Peter Cordes Aug 09 '22 at 09:12
  • For a binary search, one relevant consideration in asm is whether you branch on a compare, or use a branchless select to generate the next search location. Branchy lets branch prediction work as a sort of prefetch, but suffers from a mispredict every time it was wrong. Branchless avoids mispredicts, but the next load can't start until the previous load's data arrives. L2 cache hit latency is similar to branch-mispredict penalty on modern x86 CPUs, so for array large enough that they're mostly not hot in at least L2, [branchy is better](https://stackoverflow.com/q/11360831) – Peter Cordes Aug 09 '22 at 09:16
  • Also related, if you get to choose your input format, and can write in C or assembly: [What is the most efficient way to implement a BST in such a way the find(value) function is optimized for random values in the tree on x86?](https://stackoverflow.com/q/56521133) - an implicit tree (no pointers, just position in an array) can be searched very fast with SIMD instructions, in chunks to balance throughput and cache-miss latency. An N-ary tree is much shallower than a binary tree for N=33 (chunks of 32 elements, like 2 cache lines of ints). – Peter Cordes Aug 09 '22 at 09:19
  • Thank you so much for your clarification. I have edited my code above after applying the binary search, but I still get the same result; I can't understand what is wrong there. @PeterCordes – Beginner developer Aug 09 '22 at 22:06
  • Your code looks fine at casual glance. At some point this boils down to 'the black box that tests my code needs to be opened up to know what is going on'. – rzwitserloot Aug 09 '22 at 23:33