3

Given an array arr, find the maximum abs(i-j) such that abs(arr[i] - arr[j]) <= k.

After a lot of thought, I came up with the following algorithm,

1) Create a new array of pair<arr[i], i> (say arrayIndexPairs)
2) Sort this arrayIndexPairs based on the array value (first of pair).
3) Build a segment tree on the index (second of pair) with the arrayIndexPairs so that we can answer range max queries
4) for i <- 0 to n-1
    4.1) rightIndex = Binary search the array values (first of pair) for ceil(arrayIndexPairs[i].first) in the interval [i+1, n-1]
    4.2) int maxIndex = rangeQueryForMax(i+1, rightIndex)
    4.3) result = max(result, maxIndex - i);
return result

The complexity is O(n log n) for the sort + for every element we do a binary search O(log n) + rangeQuery, O(log n). The overall time complexity is O(nlogn + n*2*logn) which is asymptotically O(nlogn).

Is the approach correct? Is is possible to formulate a linear time solution? I tried using hashmaps but find it very hard to arrive at a linear solution.

A. Sarid
  • 3,916
  • 2
  • 31
  • 56
harun_a
  • 620
  • 1
  • 7
  • 20

3 Answers3

1

I came up with this:

def find_max_abs(l, k):
    for lenght_of_interval in range(len(l), 1, -1):
        for start_of_interval in range(0, len(l) - lenght_of_interval + 1):
            if abs(l[start_of_interval] - l[start_of_interval + lenght_of_interval - 1]) <= k:
                return lenght_of_interval - 1

Should work nicely, but it's not linear(worst case N²). I am interested if a linear algorithm exists

Martin Gottweis
  • 2,721
  • 13
  • 27
  • I'm not sure if this will work. We are looking for two elements in the array that are farthest apart. Sorting would destroy information on the original indexing. Also return l[-1] - l[0] is the max difference, we are looking for the distance! – harun_a Sep 21 '16 at 07:27
1

For the general case, your idea seems efficient.

For the case where the elements are all integers, you can do it in Θ(n k) expected time. If k = o(log(n)), this is a saving. If k is a constant, this is linear in n.

  1. Place all your elements in a hash table mapping each element e to its position in the array i (if there is more than a single e, let each entry you place in the hash table overwrite the previous one - it doesn't matter).

  2. For each element e at position i, and d = -k, -(k - 1), ... 0, 1, ... k, check if e + d is in the hash table. If so, you have the position of e + d, say j, from the hash table.

  3. Retain the positions of the maximal distance you found in 2.

Ami Tavory
  • 74,578
  • 11
  • 141
  • 185
  • Isn't theta(nk) a pseudo polynomial solution? In that case, O(n^2) works better? Also, if k is a constant but it's huge (of the order of billions), again won't O(n^2) be better? – harun_a Sep 21 '16 at 07:23
  • That's why I specified the condition that *k = o(log(n))* (note the use of the small o - I even linked to the definition). If that's the case, then this will be more efficient by definition. If, say, k=3, then it would start being more efficient from some value of n, and if k=1B, then it would start being more efficient from some larger value of n. I also stated (in the first sentence) that, for the general case, your original solution is the efficient one. – Ami Tavory Sep 21 '16 at 07:28
  • Ah, I misread the k = o(log(n)) part! Thanks! A llinear solution is not possible? – harun_a Sep 21 '16 at 07:29
  • 1
    Saying that a linear solution is not possible - is something that requires proof. I only know that, for the general case, I couldn't find one (that's a much weaker statement). Proofs of negative results are usually more abstract (e.g., the n logn lower bound on comparison based sorting). – Ami Tavory Sep 21 '16 at 07:30
0

It seems a bit forced the definition of "linear". I would approach in a different way. We can notice that the function distance d is searching for the max. So we know that there is the following combinations: - DIST COUPLES COUNT

  • d=n-1 (a[0],a[n-1]) 1

  • d=n-2 (a[0],a[n-2]), (a[1],a[n-1]) 2

  • ...
  • d=1 (a[0],a[1])...(a[n-2],a[n-1]) n-1

Since we search the maximum we are going to investigate first by the maximal distances . So we have best case O(1), worst case the sum from 1 to n-1 =(n-1)*(n/2) =O(n2). Average I would expect better performances, since it is can be implemented very efficently.
Here the C implementation:

#include "stdio.h"
#include "stdlib.h"
#define ARRAYSIZE 10000

int find_dist(int * result, const int *array, int n,int k )
{
  int i,j,ti;
  for (i=n-1;i>0;i--)
  { 
     ti=i;       
     for (j=0;ti< n ; j++,ti++)
        if (abs(array[j]-array[ti])<=abs(k))
            {
                result[0]=j;
                result[1]=ti;
                return 1;
            }
   }
return 0;
}

int main()
{
  int array[ARRAYSIZE],result[2],i;
  for (i=0;i<ARRAYSIZE;i++)
    {
        array[i]=rand()%1000;
        //printf("%d ",array[i]);
    }
  if (find_dist(result,array,ARRAYSIZE,3))
    printf ("\n%d %d\n",result[0],result[1]);
  else
    printf ("No items match requirement\n");
 }
jurhas
  • 613
  • 1
  • 4
  • 12