5

Now I have N different intergers, I need to find an interval that has the most numbers whose value is between the endpoints of the interval in O(NlogN) time. I call it a "divide and conquer" problem because it is in my final exam's "divide and conquer" category. I have been thinking about it for 2 weeks and have done a lot of experiments, none of them are right(compared to a brute force algorithm). Could someone help me?

examples:

8,1,3,4,7. The answer is 1-7.

2,6,5,4,9,8. The answer is 2-9 or 2-8.

I think the word "interval" doesn't express my meaning. I mean to find a subsequence of the array that has the most numbers whose value is between the endpoints of the subsequence. Eg.1: "1,3,4,7" has two numbers(3,4), and eg.2: both "2,6,5,4,9" and "2,6,5,4,9,8" have three numbers(6,5,4).

here is my code(O(n^2)). @Vaughn Cato I use this to compare to your code.

#! /usr/bin/env python
#coding=utf-8
import itertools
def n2(numbers):
  a = [0]*len(numbers)
  ans = -1
  l = 0
  r = 0
  for j in range(1,len(numbers)):
    t = 0
      for i in range(j-1,-1,-1):
        if numbers[i]<numbers[j]:
          x = t - a[i]
          if x>ans:
            ans = x
            l = i
            r = j
          t += 1
        else:
          a[i] += 1
  return (numbers[l],numbers[r],ans)

def countBetween(numbers,left,right):
  cnt = 0
  for i in range(left+1,right):
    if numbers[left]<numbers[i]<numbers[right]:
      cnt += 1
  return cnt

for numbers in itertools.permutations(range(5)):
  ans1=n2(numbers)
  ans2=longestInterval(numbers)
if(ans1[2]!=ans2[2]):
  print ans1,ans2,numbers
Amos
  • 3,238
  • 4
  • 19
  • 41
  • 4
    If you've been at it for 2 weeks you've probably tried something. Feel free to share. – keyser Jan 29 '13 at 10:15
  • Show us your code, an example of the problem with the expected solution and tell us any problem details/restrictions you haven't mentioned. – Alexey Frunze Jan 29 '13 at 10:21
  • 5
    Any constraints on the interval? Otherwise you can output [MIN_INT, MAX_INT] in O(1). – Henrik Jan 29 '13 at 10:21
  • 1
    From my understanding, your problem is to find the min and the max of your integer list. A merge sort (which has O(N.log(N)) complexity) and you're done ... I don't think it's that simple so explain your subject clearly – Rerito Jan 29 '13 at 10:23
  • Sorry but I have simplified this problem so my code is not explicit. I hope the examples explain it clearly. – Amos Jan 29 '13 at 10:32
  • add some explanations for first and second examples. – ivan.mylyanyk Jan 29 '13 at 10:39
  • @henrik, probably, you mean in O(n), isn't it? At least, you need to look up over all elements of array. – ivan.mylyanyk Jan 29 '13 at 10:40
  • @cupidon4uk no, before the edit it was possible to interpret the question in way that you don't need to look at the elements at all, simply output the maximum representable interval. But now the question is clarified somewhat by the examples. – Henrik Jan 29 '13 at 10:50
  • did someone get examples? If so, share it with others, because I cannot get it at all. – ivan.mylyanyk Jan 29 '13 at 11:10
  • oh, now I got it. Heh, indeed, this is very interesting problem. – ivan.mylyanyk Jan 29 '13 at 11:56
  • actually this isn't so bad because it doesn't require shuffling of #s. all you have to do is adjust the end points of the subseq. hint: say you divide the list into two and you have the right answer for the left and the right answer for the right.. how would you merge them? this is *the* key. – thang Jan 29 '13 at 12:15
  • @thang I have spent much time in finding such a merging method. However it seems impossible. It's not like an RMQ problem. – Amos Jan 29 '13 at 12:51
  • @amos, what have you tried? suppose i have an array a[1..N] (wlog, assume N even), and i know that the solution for a[1...N/2] is i,j (i – thang Jan 29 '13 at 13:04
  • @thang, interesting starting point. Nevertheless, the merge of the case where j = k-1 is the key to the problem (and seems particularly tricky to me). – Rerito Jan 29 '13 at 13:38
  • What about an array like `5666667 1229223`? (I know, the question is about different integers, so just think 50, 61, 62, 63, etc.) In the first half, the best solution is 5-7, and in the second half, the best solution is 1-3, but overall, the best solution would be 5-9. Taking just the solutions of the subproblems, i.e. 5-7 and 1-3, how could you merge them to 5-9? – tobias_k Jan 30 '13 at 08:44
  • @tobias_k I don't think we could merge the two subproblems only using this information. I was trying to keep an array as(aftersmaller):as[i] means there are as[i] numbers after i that are smaller than i. Similarly we can construct the other 3 arrays:ab,bs,bb. But I still couldn't find a way to use them effectively. – Amos Jan 30 '13 at 09:40
  • This problem has been solved at http://stackoverflow.com/questions/22952096/how-to-find-longest-constrained-subsequence by David Eisenstat. – Amos Apr 18 '14 at 12:03

2 Answers2

1

NOTE: This doesn't actually work, but it might give you some ideas.

Think of it this way:

  • Let X be the array of numbers.
  • Let s be the index of the start of the subsequence.
  • Let e be the index of the end of the subsequence.

If you pick an arbitrary partition index p, then the longest subsequence either goes across this partition or it falls to the left or right of that partition. If the longest subsequence goes across this partition, then s < p <= e. To find s, find the index with the most numbers between s and p which are greater than X[s]. To find 'e', find the index with the most numbers between p and e which are less than X[e].

You can recursively check the left and right sides to see if you can find a longer subsequence.

Finding which index has the most greater numbers to the right or the most less than numbers to the left can be done in linear time if you have the indices of X sorted by value:

To find the start index, begin with the first index of your sorted list of indices and say it is the best so far. If the next index is greater than the best so far, then any future index will need to be even farther to the left than our current best to be the new best, so we subtract one from our best index (but remember what the best index really was). If the next index is to the left of our best index, then make it be the best index. Keep repeating this process, for each of the indices in order.

You can do a similar procedure to find the best index for the end on the right side.

The only remaining trick is to maintain the sorted list of indices for whatever range we are working on. This can be done by sorting the entire set of numbers initially and finding their indices, then at each level of the recursion we can split the sorted indices into two sublists in linear time.

Here is a python implementation of the idea:

# Find the index from the given indices that has the most numbers to the
# right of it which are greater in value.  The indices are sorted by
# the value of the numbers at that index. We don't even need to know
# what the numbers are.
def longestLowerSequence(indices):
  best_index=indices[0]
  target_index=best_index
  for i in range(0,len(indices)):
    if indices[i]<target_index:
      best_index=indices[i]
      target_index=best_index
    else:
      target_index-=1
  return best_index

# Find the index from the given indices that has the most numbers to the
# left of it which are less in value.
def longestUpperSequence(indices):
  n=len(indices)
  best_index=indices[n-1]
  target_index=best_index
  for i in range(0,n):
    if indices[n-1-i]>target_index:
      best_index=indices[n-1-i]
      target_index=best_index
    else:
      target_index+=1
  return best_index

# Return the pair of indices which has the most values between it.
def longestRangeFromSortedIndices(numbers,indices,begin,end):
  assert end>begin
  if end-begin<=2:
    return (indices[begin],indices[end-1])
  assert type(indices) is list
  partition=(begin+end)/2
  left_indices=filter(lambda index: index<partition,indices)
  right_indices=filter(lambda index: index>=partition,indices)
  assert len(left_indices)>0
  assert len(right_indices)>0
  left=longestLowerSequence(left_indices)
  right=longestUpperSequence(right_indices)
  left_range=longestRangeFromSortedIndices(numbers,indices,begin,partition)
  right_range=longestRangeFromSortedIndices(numbers,indices,partition,end)
  best_size=countBetween(numbers,left,right)
  best_range=(left,right)
  left_size=countBetween(numbers,left_range[0],left_range[1])
  right_size=countBetween(numbers,right_range[0],right_range[1])
  if left_size>best_size:
    best_size=left_size
    best_range=left_range
  if right_size>best_size:
    best_size=right_size
    best_range=right_range
  return best_range

def sortedIndices(numbers):
  return sorted(range(len(numbers)),key=lambda i: numbers[i])

def longestInterval(numbers):
  indices=sortedIndices(numbers)
  longest_range=longestRangeFromSortedIndices(numbers,indices,0,len(numbers))
  return (numbers[longest_range[0]],numbers[longest_range[1]])
Vaughn Cato
  • 63,448
  • 5
  • 82
  • 132
  • Thanks! But I think this method may be wrong. I have compared your code to a brute force code and found this:(1, 3, 4, 0, 2) You code gives (1,2) and the ans is (1,4). – Amos Jan 31 '13 at 07:47
  • @amos: I agree. It makes the assumption that you can find the start and end indexes independently on each side of the partition, but they are actually interrelated. – Vaughn Cato Jan 31 '13 at 14:14
0

I believe this is a variant of the maximum subarray problem.

It can be solved using divide and conquer as follows:

  1. Divide the integer array into equal halves

  2. Compute the results R1, R2 on both halves respectively(R1, R2 are lengths of the maximum intervals for each half, and the start and end points are stored as well)

  3. Obtain the minimum integer MIN from the first half and the maximum integer MAX from the second half and compute result R3 as the distance from MIN to MAX in the original array (Min and Max are the start and end point respectively)

  4. Return the largest of R1, R2 and R3 as the result of the entire problem

Why this works:

The largest interval comes from one of the three cases: 1) the first half 2) the second half 3) across the two halves. Thus, computing the largest of the three yields the optimal result.

Time complexity:

Solving the recurrence:

T(n) = 2T(n/2) + O(n)

gives T(n) = O(nlogn). Note: as the recurrence indicates, we solve two subproblems of half size(2T(n/2))and find the minimum and maximum integers in two halves in linear time(O(n)).

Terry Li
  • 16,870
  • 30
  • 89
  • 134
  • Thanks! But I wonder if you misunderstand the problem. “Obtain the minimum integer MIN from the first half and the maximum integer MAX from the second half and compute result R3 as the distance from MIN to MAX in the original array (Min and Max are the start and end point respectively)” is not right. – Amos Feb 01 '13 at 05:47
  • @amos I guess I ignored the fact that an integer in the interval might not be in the range. I'll leave the answer as it is in case someone should get inspiration from it. I'll continue working on it. – Terry Li Feb 01 '13 at 14:34