0

i want to share with you my impl for binary search algo

my idea is to check not just the mid element but first & last elements every time before we check the mid one, so if any of first or last have the value needed there is no need to loop and divide searching for mid anymore. i think this way take less execution time, is it true or not? and why (if not) ?

#include<bits/stdc++.h>
using namespace std;
main()
{   

    vector<int> v(100000); int n;
    sort(v.begin(),v.begin()+n);

    int first =0; int last= n-1;
    int mid; int idx=-1;
    int x; cin>>x;
    while(first<=last)
    {   
        if(x==v[last] || x== v[first])
        {   
            idx=x==v[last]?last:first;
            break;
        }
        else
        {   
            mid= first+(last-first)/2;

            if(v[mid]<x) first=mid+1;
            else if (v[mid]>x) last=mid-1;
            else if (v[mid]==x)
            {   
                idx=mid;
                break;

            }
        }

    }

    if(idx ==-1) cout<<"NOT FOUND";
    else cout<< idx;
}

it will return the last occurrence of the element if the element is found in the first or last position

user4581301
  • 33,082
  • 7
  • 33
  • 54
  • 1
    Memory fetches are potentially expensive. There's more to performance than big-O analysis. If you think this is better, test it on some very large datasets of different types: random, sorted, sorted *backwards*, etc., measure execution time, and see what sort of results you get. – 3Dave Sep 22 '21 at 21:06
  • 1
    Depending on your compiler, even `v[mid]` could be a bottleneck. Assign `v[mid]` to a temp variable and use that for your `if`s. (Your version also isn't particularly thread-friendly.) – 3Dave Sep 22 '21 at 21:08
  • Sorry, I don't get it. Can you explain? – STEV Sep 22 '21 at 21:15
  • If you're looking for people to go over your code and point out improvements, you want [Code Review](https://codereview.stackexchange.com/help/asking). Yes, I deliberately linked to their help pages on asking questions because you'll want to read them before posting there. Trust me. If you're just dropping this here for other purposes, know that Stack Overflow is a Q&A site. If you're not asking a question, you better be answering. In your case it looks like you're interested in speed. Your question would be helped greatly by adding the result of a few profiling runs. – user4581301 Sep 22 '21 at 21:27
  • Now for some tactical notes: `#include` [loads the metaphorical gun](https://stackoverflow.com/questions/31816095/why-should-i-not-include-bits-stdc-h) and [using namespace std;](https://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice) takes the safety off. This makes it very easy for you to accidentally shoot yourself in the foot. `main` returns `int` No exceptions. Most C++ compilers will outright refuse to compile `main()`. – user4581301 Sep 22 '21 at 21:31
  • All what i need to know is this impl is more optimized and fast one or not. – STEV Sep 22 '21 at 21:31
  • Profile it. You know nothing without hard, real-world numbers. Some of the top sorting algorithms don't look all that impressive in O notation; they take advantage of how things really work to eke out their place among the greats. – user4581301 Sep 22 '21 at 21:36
  • Any "tweaking" of algorithms is likely to only provide benefit if you avoid CPU cache misses - which is the feature that allows less efficient algorithms to work faster in a limited set of scenarios. ... That said, with this approach, you may introduce a deficit in that when the range is small, you are likely to read the same value twice (mid in first run may be first in second), plus, reading first, last then mid will cause multiple cache misses when the range is large. So I doubt that this approach would have any benefit; if anything, it's going to lose performance. – Den-Jason Sep 22 '21 at 21:39
  • Consider the number of comparisons you'll need if the element you're looking for is the _second_ element in the array. Compare this with the number needed for the standard implementation of a binary search (without the unnecessary `if (v[mid]==x)` check). – 1201ProgramAlarm Sep 22 '21 at 21:43
  • One method I imagine may provide real-world acceleration of the search would be if the vector is aligned to the CPU cache line length, and on every pass of the loop, check the value that would be at the beginning of the cache line for `mid` as well as that at the end. So a bit like the "old days" where you load a sector of data from a floppy disk and look at more than one value within it - because in that scenario it makes no sense to load a sector more than once if you don't have to. – Den-Jason Sep 22 '21 at 21:55
  • What is really matter? Number of comparisons or the number of iteration, and why if (v[mid]==x) unnecessary? – STEV Sep 22 '21 at 22:13
  • When a question is of the form "is A faster, or B faster?" I think the best way to figure it out is to *profile* each one and see which one is faster. Optimized builds, of course; profiling an unoptimized build does not provide useful data. – Eljay Sep 22 '21 at 22:36
  • 1
    Note that you re-check one of `v[last]` and `v[first]` on every trip but the 1st. – greybeard Sep 23 '21 at 10:27
  • Have you compared it with one of the standard library’s binary searches such as `std::lower_bound`? – Ben Sep 24 '21 at 11:42

0 Answers0