0

I have completed an iterative implementation of Binary Search in python and was wondering if there is an efficient way to ensure that your input is always sorted before sorting it if needed. How can one validate the input to be sorted? What is the time complexity if the input is not sorted? I would also appreciate comments if this code can be made more efficient.

def binary_search(array, search):
    low = 0
    high = len(array)

    while low <= high and len(array[low:high]):
        mid = (low + high) // 2
        if array[mid] == search:
            return "Found " + str(search) + " at index " + str(mid)
        elif search > array[mid]:
            low = mid + 1
        elif search < array[mid]:
            high = mid - 1

    return str(search) + " not found."


search_through = [2, 3, 4, 10, 40]
to_search = 49
print(binary_search(search_through, to_search))

Output: 49 not found.

Faisal Khan
  • 105
  • 8
  • If `array = [2,3]` and `search = 2`, the output is "2 not found." – user3386109 Aug 11 '22 at 04:50
  • 1
    The `and len(array[low:high])` should not be necessary, and once the code works correctly, it won't be necessary. – user3386109 Aug 11 '22 at 04:55
  • One way to check if an array is **not** sorted is `if any(a[i+1] < a[i] for i in range(len(a)-1)):` – user3386109 Aug 11 '22 at 05:03
  • 2
    1. You ask: "What is the time complexity if the input is not sorted?"--binary search requires the list to be sorted to work. 2. Check out [Pythonic way to check if a list is sorted or not](https://stackoverflow.com/questions/3755136/pythonic-way-to-check-if-a-list-is-sorted-or-not) for myriad ways to see if a list is sorted. 3. The third conditional can be changed to an else i.e. `elif search < array[mid]:` can become `else:` because of the earlier conditionals. – DarrylG Aug 11 '22 at 07:52
  • @DarrylG: right, but the complexity can be established anyway. –  Aug 11 '22 at 07:53
  • @YvesDaoust -- so do you mean what is the complexity of sorting in that case? I just thought everyone would agree that the worst-case complexity of sorting would be O(n*log(n)) (using a "good" sorting algorithm). – DarrylG Aug 11 '22 at 08:01
  • @DarrylG: no, the question is about the complexity of binary search. –  Aug 11 '22 at 08:02
  • @YvesDaoust - I assumed everyone knew the complexity of binary search was O(log(n)) and sorting was O(n*log(n)), so was confused by the question about complexity. – DarrylG Aug 11 '22 at 08:05
  • 1
    @DarrylG: O(Log(N)) on a sorted array is familiar, found in any textbook. But the question is about an unsorted array. Though trivial, this is never discussed. You might very well a priori imagine that the algorithm does not terminate, or degenerates to linear complexity. –  Aug 11 '22 at 08:07
  • @YvesDaoust -- so we have the following steps: 1) check if array is sorted which is O(n), 2) sort if unsorted which is O(n*log(n)), 3) perform a binary search which is O(n). So our worst-case complexity is O(n*log(n)). If you're not guaranteed the array is sorted you're better off just during a linear search which has complexity O(n) (unless you can amortize the sorting cost over many queries). – DarrylG Aug 11 '22 at 08:13
  • @DarrylG: no, they are asking the complexity of `binary_search` on an unsorted array. –  Aug 11 '22 at 08:20
  • @YvesDaoust -- correction to previous comment--meant to say in step 3 complexity of binary search is O(log(n)). – DarrylG Aug 11 '22 at 08:22
  • Perhaps worth mentioning that Python has an efficient binary search module: [`bisect`](https://docs.python.org/3/library/bisect.html). – norok2 Aug 11 '22 at 09:17
  • 1
    If this question is about ensuring the input is sorted before applying the binary search, then forget it: it makes no sense to first ensure it, as that costs more than doing the binary search. If you're going to scan the input for checking it is sorted, you might as well use that opportunity to *find* the searched value during that scan, and then the binary search is not needed anymore. So either way, it is not helpful to check if the input is sorted. It *must* be sorted. If this cannot be guaranteed, forget about binary search. – trincot Aug 11 '22 at 09:47
  • @YvesDaoust: Yes one of the questions is what is the cost of binary search on an unsorted array. – Faisal Khan Aug 11 '22 at 21:46
  • @trincot: You answered what I was thinking also that will be it good to sort input every time if the input cannot be guaranteed to be sorted. Looks like it's better to perform this on an already sorted list like an index. – Faisal Khan Aug 11 '22 at 21:48
  • @trincot: interstingly, you and another commentator seem quite reluctant to give the answer: O(Log(N)). –  Aug 12 '22 at 06:37
  • @Yves, I think the asker asked about complexity with an assumption in mind that somehow binary search can yield correct results if only we add some logic to make sure the input is sorted. This is true of course, but at the same time makes binary search useless. And so the question on complexity hides a misunderstanding that I wanted to address in my answer. That binary search has a complexity of O(logN) has of course already been answered many times before. – trincot Aug 12 '22 at 07:37
  • @trincot: I disagree. The complexity is always discussed with the implicit hypothesis that the array is sorted, so nobody looks at the case of unsorted. As I said earlier, "You might very well a priori imagine that the algorithm does not terminate, or degenerates to linear complexity." Though the answer is trivial, the OP does ask (and confirmed in comments). –  Aug 12 '22 at 07:44
  • Yes, I know you look at the question differently, and I understand where you are coming from. I have a different perspective, since the asker suggests that the implicit hypothesis is not there. – trincot Aug 12 '22 at 07:47
  • 1
    @trincot: OP should settle this. –  Aug 12 '22 at 08:50
  • I see both the discussions and based on some more research, if array is sorted time complexity is O(logn) and if not sorted and required to sort it, then it should be O(nlogn)+O(logn), in that case if we perform linear search it will be O(n) which is obviously better. Correct me if I am wrong. Thanks for all your inputs since I had to gain a bit more understanding on this topic. – Faisal Khan Aug 12 '22 at 19:47

2 Answers2

2

To check "sortedness", you need to compare all pairs of successive elements. You cannot do faster than O(N) because every element counts.

Your function always takes O(Log(N)) time, because it halves the interval until it becomes a singleton. But of course, the returned value can be meaningless.


The function performs a three-way comparison, so in many cases two tests per iteration. Even though comparison for equality can cause early termination, it is unsure that this is faster than a version with two-way comparisons only.

For a O(Log(N)) process, efficiency is not so critical.

1

There are two scenarios to consider. Either:

  • The caller of the function guarantees that the list is sorted. In that case it is not necessary to have it double checked. It is then a documented requirement of your function, and the caller will stick to it. You can do the binary search assuming the list is sorted. This is known to have a time complexity of O(log)

or:

  • It is not 100% guaranteed that the list is sorted. In that case it makes no sense to check that it is sorted and then perform the binary search, because when you do that check, you have to visit every element of the list, which would give you the opportunity to immediately find the searched value in that list. So again, it would not be necessary to check that the list is sorted. Instead you would just use that time to actually find the value with a linear search that does not need the input to be sorted. Here the time complexity is O().

So in either case, it makes no sense to verify that the list is sorted. Doing that is a waste of time, as then you might as well use that iteration to look for the value and forget about the binary search all together. A binary search is only useful when it is guaranteed that the input is sorted. There should be no need to verify that -- it should be trusted that it is the case.

A verification step followed by a binary search would deteriorate the whole efficiency gain that a pure binary search delivers: the O(log) complexity of the binary search would be lost in the O() complexity of the verification part, giving O() as overall time complexity.

trincot
  • 317,000
  • 35
  • 244
  • 286