1

A matrix which is sorted by rows as well as columns. We need to find the Kth smallest element from the given matrix.

Have a solution with time complexity O(KlogM) but need a linear time algorithm to the problem.

Example matrix:

1   3    5
2   4    6
7   8    9

Suppose M=no of rows, N=no of columns and M<N.

My solution:

  1. Create heap of size M by taking all the matrix[i][0] elements.
  2. Find the smallest element and push the next element from the corresponding row.
  3. Repeat the 2nd step till you find Kth smallest.
Trying
  • 14,004
  • 9
  • 70
  • 110
  • 1
    I'm not too sure what is meant by "sorted by rows as well as columns". Surely if it is already sorted finding the k'th smallest element is just a constant time lookup... – SamYonnou Sep 26 '13 at 12:47
  • @mbeckish I told that i have a solution with time complexity O(KlogM). but i need linear time algorithm, so its not duplicate. – Trying Sep 26 '13 at 12:49
  • 1
    If you just iterate over all the elements in the matrix you'll have a linear algorithm with respect to number of elements in the matrix – Simon Sep 26 '13 at 12:49
  • Please explain why you "need a linear time algorithm". Is it because this is a school assignment, and your teacher told you that a linear time algorithm exists? Or is it because you need to solve a real problem and your current O(KlogM) is not fast enough? – mbeckish Sep 26 '13 at 12:49
  • @mbeckish no, it's asked in an interview to me. – Trying Sep 26 '13 at 12:53
  • 3
    Ok, then [here](http://stackoverflow.com/q/5000836/21727) is your duplicate. – mbeckish Sep 26 '13 at 12:57
  • @mbeckish would you like to explain the algorithm which the selected answer is talking about in the given link. I am not able to find in the website. And it will also help other peoples who visit this question as a reference for the algo. Thanks in advance. – Trying Sep 26 '13 at 13:01
  • @Trying - No thanks. But now that you know there is a question with a reference to the correct answer, maybe you can google for "Greg N. Frederickson and Donald B. Johnson. Generalized Selection and Ranking: Sorted Matrices. ", figure it out, and add the description to the original post. – mbeckish Sep 26 '13 at 13:02
  • 1
    Not exactly a duplicate, but you can find answer here: ["How to find pair with kth largest sum?"](http://stackoverflow.com/q/18557175/1009831). – Evgeny Kluev Sep 26 '13 at 13:51
  • Again, what do you mean by "a linear time algorithm". `O(KlogM)` is linear in `K`, and better than linear in `M`. – Teepeemm Sep 26 '13 at 16:39
  • @Teepeemm - k can range from 1 to M*N, so worst case O(KlogM) is O(MNlogM). – mbeckish Sep 26 '13 at 20:27

2 Answers2

4

The Frederickson and Johnson algorithm works roughly as follows. (It always reminds me of Quickselect.) This is from an implementation that used a different implementation as a template, so the details might be a little different than in their papers, but it is the same time complexity and the same idea. I'm assuming the legal values for k run from 0 to M * N - 1.

We will access the element in the ith row and jth column as A[i, j] (counting from 0). In addition, we will pretend we can access A[i, -1] which is negative infinity and A[i, N] which is positive infinity. We maintain two arrays of indices left[0 .. M) and right[0..M), and two variables lessthanleft and greaterthanright, with the following properties:

  • There exists a matrix element A[i0, j0] so that for all rows i, we have A[i, left[i] - 1] < A[i0, j0] <= A[i, left[i]]. That is, left[i] is roughly the position where A[i0, j0] would be if it were in row i.
  • The sum of the values left[i], for i from 0 to M - 1, is lessthanleft. Note that lessthanleft is the number of matrix elements in the positions to the left of the position indicated by left -- that is, matrix elements that are strictly less than A[i0, j0].
  • lessthanleft <= k. So this says that the element we're looking for is in some row i to the right of left[i].
  • There exists a matrix element A[i1, j1] so that for all rows i, we have A[i, right[i] - 1] < A[i1, j1] <= A[i, right[i]]. That is, right[i] is roughly the position where A[i1, j1] would be if it were in row i.
  • The sum of the values N - right[i], for i from 0 to M - 1, is greaterthanright. Note that greaterthanright is the number of matrix elements in the positions to the right of the position indicated by right -- that is, matrix elements that are strictly greater than A[i1, j1].
  • greaterthanright <= M * N - k. So this says that the element we're looking for is in some row i to the left of right[i].

These properties can be initialized by setting every element of left to 0, every element of right to N, and both lessthanleft and greaterthanright to 0 -- unless k = 0 (in which case we return A[0, 0]) or k = M*N - 1 (in which case we return A[M-1, N-1]). This corresponds, by the way, to i0=0, j0=0, i1=M-1, j1=N-1.

Now in every iteration, we pick a pivot matrix element A[i2, j2] with left[i2] <= j2 < right[i2]. (There are several strategies; I'll talk about this below.) We use arrays less[0..M) and lessequal[0..M) which we will fill in the following inner loop, and set variables nless and nequal and ngreater to 0. In the inner loop we iterate over all rows i, and set less[i] and lessequal[i] so that A[i, less[i] - 1] < A[i2, j2] <= A[i, less[i]] and A[i, lessequal[i] - 1] <= A[i2, j2] < A[i, lessequal[i]]. (Since less[i-1] >= less[i], you can use the values of less[i-1] and lessequal[i-1] and start linear search to the left from there to make this take O(N) time overall.) After each such step we add less[i] - left[i] to nless, we add lessequal[i] - less[i] to nequal, and we add right[i] - lessequal[i] to ngreater.

After this inner loop, we check to see if lessthanleft + nless >= k. If this is the case, we continue with the entries less than A[i2, j2] by setting right to be less (by pointer flipping if you want to prevent allocation in the loop for the arrays, or by copying values from less to right) and greaterthanright to be greaterthanright + ngreater + nequal, and continuing with the next iteration. If lessthanleft + nless + nequal < k, then we continue with the entries greater than A[i2, j2] by setting left to be lessequal and lessthanleft to be lessthanleft + nless + nequal, and continuing with the next iteration. Otherwise, the entry we're looking for is among the entries equal to A[i2, j2], so we return A[i2, j2].

Now about picking the pivot. One way would be to pick a random number c in the interval [0 .. N * M - lessthanleft - greaterthanright); we then find the row containing the cth matrix element between left and right, by subtracting left[i] - right[i] from c for each i starting from 0 until it becomes negative. Now select the i where it becomes negative and let j = round((left[i] + right[i])/2). Alternatively you could do a median-of-medians style computation on the medians of the remaining entries in each row.

So generally, the set of matrix elements that is in between left[i] and right[i] on each row gets split, hopefully roughly in half on every iteration. The complexity analysis is similar to the Quickselect one, where the expected number of iterations is "almost certainly" (in the mathematical sense) logarithmic in the number of values in the initial pool. By using a median-of-medians style pivot choice, I believe you could achieve this with certainty while paying for it in practical runtime, but I'll assume for now that "almost certainly" is good enough. (This is the same condition under which Quicksort is O(N lg(N)).) Any individual iteration takes O(M) time to pick a pivot, then O(N) time in the inner loop. The initial pool of candidates has M*N members, so the number of iterations is almost certainly O(lg(M * N)) = O(lg(M) + lg(M)), for a total complexity of O(N * (lg(M) + lg(N))). The algorithm that uses a heap with an entry for every row takes O(k * lg(M)), which is much worse since k is only bounded by N*M.


Here's a simple example: consider M=4, N=5, and we need to pick the k=11th element out of the matrix A given as follows:

 6 12 17 17 25
 9 15 17 19 30
16 17 23 29 32
23 29 35 35 39

We initialize left to [0,0,0,0], right to [5,5,5,5], lessthanleft and greaterthanright both to 0.

Suppose for the first iteration we pick A[1, 2]=17 as the pivot. During the inner loop we start examining the first row from entry right[0]-1=4 to the left, finding the first occurrence of a number that is at most 17. This is in column 3, so we set lessequal[0]=3+1=4. We now continue searching for the first occurrence of a number that is strictly less than 17; this occurs in column 1 so we set less[0]=1+1=2. For the next row, we can start looking for a value that is at most 17 at column lessequal[0], and we find it in column 2 so set lessequal[1]=2+1=3. Looking for a value that is strictly less than 17 starting from less[0], we set less[1] = 2. Continuing we get less = [2,2,1,0] and lessequal = [4,3,2,0]. Hence nless = 5, nequal = 4, and ngreater = 11. We have lessthanleft + nless + nequal = 9 < 11, so we continue with the entries greater than 17 and set left = lessequal = [4,3,2,0] and lessthanleft = 9.

For the next iteration, we need to pick a pivot on some row i in the centre between left[i] and right[i]. Maybe we pick row 2, which means we have A[2,3] = 29. During the inner loop, we now get less = [5,4,3,1] and lessequal = [5,4,4,2] with nless = 4 and nequal = 2 and ngreater = 5. Now lessthanleft + nless = 13 > 11, so we continue with the entries less than 29 and set right = less = [5,4,3,1] and greaterthanright = 7.

For the third iteration, every row now has only one entry left. We pick a row at random - maybe row 3. The pivot is A[3,0] = 23. During the inner loop, we get less = [4,4,2,0] and lessequal = [4,4,3,1]. Hence nless = 1, nequal = 2, and ngreater = 1. We now have lessthanleft + nless = 10 < k = 11 <= lessthanleft + nless + nequal = 12, so we return 23. Indeed, there are 10 matrix entries strictly less than 23 (those to the left of A[i, less[i]] in the last iteration) and 12 matrix entries that are at most 23 (those to the left of A[i, lessequal[i]] in the last iteration).

Erik P.
  • 1,577
  • 10
  • 18
  • 1
    +1 nice. But it would be great if you can trace it by some example, otherwise it's hard to follow. – Trying Sep 26 '13 at 21:51
  • There were a suggested edit to this post that got rejected, containing a lot of code. The code is here in case someone is interested: http://pastebin.com/kASWRHdA – Matsemann Dec 09 '13 at 13:16
-1

I think you can write an O(k) algorithm that 'jumps' from the current minimal value to the next one, thus making at most k operations. You can use 2 references for that, one that indicates the current minimum value for all columns and one that indicates the current minimum value for all rows. You then advance on each iteration either by row or by column. After k iterations, you will get to the desired Kth value.

aUserHimself
  • 1,589
  • 2
  • 17
  • 26