8

This question arose from the discussion in the comments of this answer.

First, let's say it's quite difficult to define what out-of-order is. Taking the example Pavel Shved gave, in the list [1,5,10,2,3,4,5,6,7,8,9,10,11] we can "clearly" see that 5 and 10 (indices 1 and 2) are out of order. But a naïve algorithm that simply checks some kind of sorted list invariant would not point those out.

  • checking a[i-1]<=a[i] for all 0<i<=N would yield the element at index 3 (which is 2);

  • checking a[j]<=a[i] for all 0<=i<=N and 0<=j<=i would yield all elements in indices 3 to 12;

My question is: can you think of an algorithm to solve this problem that yields the "correct answer" (i.e. indices 1 and 2)? If so, under what time and memory complexity would it run?

Community
  • 1
  • 1
R. Martinho Fernandes
  • 228,013
  • 71
  • 433
  • 510

3 Answers3

11

Probably the best approach to this would be to first find the longest increasing subsequence and then consider all elements not part of that sequence to be out of order. The algorithm provided on the linked page runs in O(n log n) time and uses O(n) space (in addition to that of the list itself).

Such an approach would definitely yield the correct answer for your example case, since the longest increasing subsequence would be the 1 through 11 sequence not including the extra 5 and 10.

R. Martinho Fernandes
  • 228,013
  • 71
  • 433
  • 510
Amber
  • 507,862
  • 82
  • 626
  • 550
  • That works in my first example but look at this one: [1, 10, 2, 3, 4, 6, 5]. 10 and 6 are out of order, but 1 and 5 not... The more I think of it, the more my question seems a moot point, because I can't quite define what out of order is, as Botz3000 points out.. – R. Martinho Fernandes Sep 07 '09 at 21:21
  • Oh, wait! I didn't check the link because I thought I knew what longest increasing subsequence meant! Then I read this "This subsequence is not necessarily contiguous". Now, I think that may be the solution. – R. Martinho Fernandes Sep 07 '09 at 21:23
  • 1
    Note that with [1, 2, 3, 4, 6, 5] what "out of order" means is ambiguous - the 6 could be out of order, OR the 5 might be considered out of order, since moving either one a single place to the left or right would put that portion of the list "into order". In other words, you could say "the 5 is too far right" or "the 6 is too far left" and both would be essentially equivalent in terms of "degree of disorder". – Amber Sep 07 '09 at 21:27
  • @Dav, you are correct. As pointed out in the wikipedia link provided, the longest increasing subsequence (let's call it LIS for short) is not unique. But the for original goal of this algorithm (see the link in the question) pointing any one of them is enough. – R. Martinho Fernandes Sep 07 '09 at 21:31
  • it's worth noting that this can be done in O(N log K) where K is the number of "out-of-order" elements – yairchu Sep 07 '09 at 21:40
  • I'm taking this as the accepted answer because it seems to define (and solve) my blurred and "humanized" definition of out-of-order as best as possible. – R. Martinho Fernandes Sep 07 '09 at 21:42
1

How should the algorithm know which elements you consider out of order?

If the rule is "list[i+1] should always be list[i] + 1", then it would be easy, of course, just memorizing that after 1, the next element should be 2, select those in between and so on. But you need a precise rule for the algorithm to determine which elements are to be considered "out-of-order".

Botz3000
  • 39,020
  • 8
  • 103
  • 127
0

As Dav said, longest increasing subsequence is probably the best you can do.

this is off the top of my head, so it may (read probably isn't :P) correct: The obvious lower bound for this problem is O(n) since you at least have to read each number once. But suppose there was an algorithm that ran in O(n) time. Then you can simply insert the out of order elements in order in linear time, but the best comparison sort algorithm have a lower bound of O(nLogn) so it's a contradiction. (otherwise, there are non-comparison sort methods like bucket or radix sort that either use a lot of memory or exploit the size of the numbers being sorted...)

Charles Ma
  • 47,141
  • 22
  • 87
  • 101
  • Correct. The factor of N is required to process each element in the list (a necessity), and the factor of log N is required to determine where that element belongs. Thus any algorithm that guarantees a sorted result will have a lower bound of O(N log N). – Amber Sep 07 '09 at 21:23