
Here is my thought:

You have N elements to begin with.

But you aren't going through each one, are you?

Say I have

{1, 2, 3, 4, 5, 6, 7, 8, 9} and I want to find 4.

The first step is looking at the middle element, 5; we see 4 < 5, so we are left with {1, 2, 3, 4, 5}. The middle of that is 3; we see 4 > 3, thus we are left with {3, 4, 5}.

Now we will get 4.

But that was only 3 steps! I don't understand why the first search takes N elements, when we are only looking at the ((N-1)/2)th element, which is one step?

EDIT!!!

Here is what I am taught:

search 1: n elements in search space

search 2: n/2 elements in search space

search 3: n/4 elements in search space

... search i: 1 element in search space.

search i has n / 2^(i-1) elements in its search space, so you set n / 2^(i-1) = 1 and solve for i, which gives

i = log2(n) + 1.

What I don't understand:

You have n elements, I agree, but you aren't searching through all of them; you only examine one element per search, so why do you count all n?

garchyguy
  • In a sorted array you don't need to go through every number; you only need to check O(log2 N) numbers to find whether the element is in the array or not – Iłya Bursov Sep 17 '16 at 20:21
  • I don't understand your last sentence. Can you reformulate/explain it in more detail? –  Sep 17 '16 at 20:21
  • O(.. lg n) is the basis for operations that "divide the problem in half each loop". In this case that is simply the mathematical representation - note that lg in such cases is log to base-2. A binary search (which requires a sorted sequence at the start) works very nicely as an example here in concrete numbers. For example it would take lg(1000), or ~10, loops / item reads to search in 1000 elements. 10 < 1000 and the difference grows as n -> infinity, as described by O. – user2864740 Sep 17 '16 at 20:21
  • @user2864740 "lg" often also represents the base-10 logarithm, so that might be confusing. However it doesn't matter here: O(log_a n) = O(log_b n) for all a and b > 1. –  Sep 17 '16 at 20:25
  • @Eichhörnchen Correct; I've found that lg2 makes many CS algorithms easier to work into 'concrete' numbers. – user2864740 Sep 17 '16 at 20:27
  • Without ordering: you need to look at each element, so you need n steps. When ordered: after reading one element you can throw away one whole side of the remaining elements (that side is now irrelevant because of the ordering). If you optimize your *to-look-at* positions (look at the middle) you can always throw away half of the remaining elements. This means: n candidates are possible. Look at the middle one; now the left or right n/2 candidates are irrelevant. Look at the middle one of the remaining n/2 candidates; now you can throw away the other half -> only n/4 candidates. – sascha Sep 17 '16 at 20:27
  • Possible duplicate of [how to calculate binary search complexity](http://stackoverflow.com/questions/8185079/how-to-calculate-binary-search-complexity) – A. Sarid Sep 17 '16 at 20:38
  • @sascha, yes I know that only n/2 are relevant. But we aren't searching through the n/2 elements. We are not going arr[0], arr[1], arr[2]... in that order, are we? Then why does the first search require n elements and the second search n/2 elements...? – garchyguy Sep 17 '16 at 20:46
  • @Eichhörnchen, here is the question: We have the array {1, 2, 3, 4}. With the binary search we are not searching through every single index, right? In search 1, we are only looking at 1 element (the middle). In search 2 we are only looking at 1 element again, so shouldn't it be 1 + 1 + 1 + ... + 1 for k searches? Since in each search you only look at 1 element and not all n or all n/2 or n/4 elements? – garchyguy Sep 17 '16 at 20:48
  • Who says it's requiring n/2 elements? Each step is **one step** (look at one element). The question is: how many steps do you need? With my example you see that the number of candidates is n, n/2, n/4, n/8, ... The divisors grow exponentially, which is the inverse of the logarithm. So you need log(n) one-steps (log(n) times looking at exactly one element). The basic assumption here: you can look at element x with only one step (array-based data structure with index-based lookup; won't work with lists). – sascha Sep 17 '16 at 20:49
  • @garchyguy Your terminology is confusing me. There is only one search, the search which is intended to find element 4 in {1, 2, 3, 4, 5, 6, 7, 8, 9}. But the search has multiple steps. Each step involves looking at and comparing one element of the array. The question is how many steps we will need before the search returns. "n" is not the number of searches or steps, it is the size of the array; in the example n=9. The number of steps in your example is m=3. Now the question is how m compares to n if n becomes very large. The answer is that m will be approximately log(n) up to a factor. –  Sep 17 '16 at 20:55
  • @sascha, that is my question. This is the process, I believe: You start with n candidates, you take half and are left with n/2 candidates. Now here is the issue: if you are not searching one-by-one through all those candidates, then the number of _steps_ for the first search isn't n, since you didn't go through all elements. If it was a for loop like for(int i = 1; i < arr.length - 1; i++) then that is n steps. (A sketch of this distinction follows these comments.) – garchyguy Sep 17 '16 at 20:59
  • You don't take half. You look at one element (which you choose) and can deduce which of the remaining two ranges (the left of the chosen one or the right) does not need to be searched for. Vatine's answer is quite good! It also gives the right keyword for the most important assumption besides the ordering: **random-access**. Read it up and don't get confused by the random in the name. – sascha Sep 17 '16 at 21:02
  • @sascha, I have also edited my post and the process which I was taught. I know you look at one element, but why then are we counting the number of **all the elements** in the search space? – garchyguy Sep 17 '16 at 21:15
  • Because search space != number of steps. Please take a basic algorithms course. My argument using the size of the search space was only meant to show you how many O(1) steps (read the middle element) you need in the worst case. The result (number of steps to find element x) is then number_of_steps_reading_middle * O(1). We know how many steps we need. – sascha Sep 17 '16 at 21:16
  • @sascha, I'm sorry for this, I know I am a noob but I am trying to learn. O(g(n)), the time complexity, shows how many steps it takes to complete the algorithm, right? – garchyguy Sep 17 '16 at 21:17
  • **Asymptotically** in **worst-case**. Yes. – sascha Sep 17 '16 at 21:18
  • @sascha, I think I am starting to get a better understanding at least. We are trying to find the time complexity of a binary search, so I am trying to find the g(n) for O(g(n)) in this case. To analyze this algorithm, I need to find how many steps it takes in the worst case. How do I do that? I've read links, but I am seriously unsure and would like a fresh start. – garchyguy Sep 17 '16 at 21:24
  • You are missing the basics of complexity-theory and algorithms. This won't change because of some SO-questions and chat-like discussion. Look for some university-course or some tutorial slides. Even wikipedia offers a lot! (The understanding described in the last comment of yours sounds good) – sascha Sep 17 '16 at 21:27
  • @garchyguy well look at simpler ones like bubble sort, insertion sort, selection sort. Their big O complexity timings don't involve Logarithms. – barlop Sep 18 '16 at 00:06
  • @barlop, yes I am aware of those, but I was just confused on this one. – garchyguy Sep 18 '16 at 06:07

1 Answer


The main reason why binary search (which requires sorted data in a data structure with O(1) random access reads) is O(log N) is that for any given data set, we start by looking at the middle-most element.

If that is larger than the element we're looking for, we know that we can ignore anything from that point to the end. If it is smaller, we can ignore anything from the beginning to that element. This means that at every step, we're effectively cutting the remaining search space in half.

By cutting the problem set in half at every step, we can (relatively) easily see that getting from N elements down to a single element takes O(log N) steps.

The reason we're not terminating earlier in your example is that even though it seems as if we're scanning the first elements, the only things we actually do are "get the length of the array" and "access the middle-most element" (so, until the search space has shrunk far enough, we never know whether the element we're looking for is contained in the array).

Vatine
  • Hello, that is the confusion for me: how do we see that going from N to 1 takes log(N) steps? Getting from N to N/2 is 1 step, then N/2 to N/4 is 1 step ... so how do we calculate it? – garchyguy Sep 18 '16 at 06:23
  • @garchyguy Log is the anti-exponent. If you have an original problem that's X elements large and it takes S steps, and the size of the problem halves every step, then the original problem must have been O(2 ** S) elements large. So if you have an initial problem that's N elements large, finding the solution must take O(log N) steps. – Vatine Sep 18 '16 at 08:33