1

So, I'm very new to programming and computer science, and have been trying to understand the concept of time complexity while solving my code. However, when it comes to sorting arrays, I always get confused about a few things: In my understanding solving a problem with a sorted array should come in the best case complexity, and unsorted array will have a worst case. What I always get confused about is, how do we actually take advantage of an array being sorted in a problem that involves searching? Meaning, how will this reduce my time complexity, because i thought i will have to run the loop the same number of times.

For example, if i have an array and want to find two indices whose value add up to a specific target, will it make a difference in time complexity if the array is sorted or unsorted?

Thanks in advance for helping me out.

user10817746
  • 13
  • 1
  • 1
  • 3
  • 1
    If the array is sorted, one can, depending on the problem, sometimes use [binary search](https://en.wikipedia.org/wiki/Binary_search) or other algorithms that exploit this structure. For your example, *yes* this makes a difference, since here we can use two "cursors": one that moves forward, and one that moves backward. – Willem Van Onsem Jan 03 '19 at 18:38
  • Perhaps you are thinking of "sorted" too strictly. If your problem is to find two indices whose values add up to 100, then a "sorted" array might be `[40, 60, 1, 200, 9, 83, 17]` – Ben P. Jan 03 '19 at 18:40
  • Answer is not always obvious see legendary: [Why is it faster to process a sorted array than an unsorted array?](https://stackoverflow.com/q/11227809/2521214) but to get back to your question searching on sorted data you do not need to look at all the array elements just on few ... with binary search on `n` numbers only `log2(n)` peaks instead of `n` so no you do not have the same number of iterations .... If you got semi sorted arrays (continuous) data instead even that can be exploited like [How approximation search works](https://stackoverflow.com/q/36163846/2521214) – Spektre Jan 04 '19 at 11:11

2 Answers2

3

Let's take a look at your sample problem: find two numbers whose sum equals a given number.

Let's say you have an unsorted array: [2, 8, 1, 3, 6, 7, 5, 4], and the target is 11.

So you look at the first item, 2, and you know that you have to find the number 9 in the array, if it exists. With the unsorted array, you have to do a linear search to determine if 9 exists in the array.

But if you have a sorted array, [1, 2, 3, 4, 5, 6, 7, 8], you have an advantage. When you see the value 2, you know you need to find 9 in the array. But because the list is sorted, you can use binary search. Instead of having to look at every item in the list, you only have to look at 3 of them:

Binary search will look at the 4th item, then the 6th, then the 8th, and finally determine that 9 isn't in the array.

In short, searching in an unsorted array takes O(n) time: you potentially have to look at every item to find out if what you're looking for is there. A sorted array lets you speed up the search. Instead of having to examine every item, you only have to examine at most log2(n) items. That makes a huge difference when numbers get large. For example, if your list contains a million items, binary search only has to examine 20 of them. Sequential search has to look at all million.

Jim Mischel
  • 131,090
  • 20
  • 188
  • 351
2

The advantage of a sorted array for the "target sum" problem is even better: you don't search the array at all. Instead, you start with pointers at the two ends. If the sum is equal to the target, emit that and move both pointers in. If less than the target, increment the lower pointer. Otherwise, decrement the upper pointer. This will find all solutions in O(n) time -- after taking O(n log n) for the sort.

For the case given in the comments, [40, 60, 1, 200, 9, 83, 17] the process looks like this:

Sort array:
[1, 9, 17, 40, 60, 83, 200]

Start your pointers at the ends, 1 + 200
The sum is 201, too large, so decrement the right pointer.
Now looking at 1 + 83.  This is too small; increment the left pointer.
Now looking at 9 + 83.  This is too small; increment the left pointer.
Now looking at 17 + 83.  This is the target; print (17, 83) as a solution
    and move *both* pointers.
Now looking at 40 + 60.  This is the target; print (40, 60) as a solution
    and move *both* pointers.
The pointers have now met (and passed), so you're done.

This is a lovely example. In general, sorting the array gives you a variety of options for finding things in the array much faster than checking each element in turn. A simple binary search is O(log n), and there is a variety of ways to tune this for a particular application. At worst, a binary (log base 2) search will work nicely.

However, sorting an arbitrary list costs O(n log n) as overhead; you need to figure this one-time payment into your application's needs. For instance, if your array is some sort of data base sorted by a key value (such as a name or ID number), and you have to perform millions of searches from user requests, then it's almost certainly better to sort the data base somehow before you do any searches.

If you want a thorough introduction, research "sorting and searching". One excellent reference is a book by that title by Donald Knuth; it's Vol 2 of "The Art of Computer Programming".

Prune
  • 76,765
  • 14
  • 60
  • 81