1

i have the following question:

given an array of integers A, and given t not-overlapped sections [L1,R1],[L2,R2]...[Lt,Rt]

such that every section is start-index(Li) and end-index(Ri) at the array (A),

find a data structure that returns the top k-elements from these sections (the indices are at the sections and the elements are at the array A) at O(t + klogk) complexity.

lot of thanks to the helpers, Avihai.

2 Answers2

1

The problem's phrasing is a bit ambiguous: what counts as part of the operations to be done in complexity O(t+klogk)?

I see 3 options in answering this. For the first two, which I think are wrong ways to interpret the original question, I present answers. For the third and last option I have no solution. Still I thought it necessary to make the question clear and add my answers to the first two. Additionally, adding this long discussion of the matter in a comment would be confusing if not impossible.

Here are the options I see for answering the question I started with:

Option 1 - data structure build (using the sections) NOT included

From the current phrasing of the original question it sounds like the problem asks for a data structure that can be built beforehand, using the array A and the sections [L1,R1],[L2,R2],...,[Lt,Rt] and the only operations considered for the complexity requirement would be the operations done on the data structure to extract the k highest values.

That doesn't make a lot of sense as it's unclear why would the t be put in the complexity in this case? From this answer, which you linked to in your comment to Juan's answer, you can see that by building beforehand a heap using only the relevant elements you can get O(klogk) for getting the k highest value elements from the data structure (once it exists).

Option 2 - ALL the operations included

If we read it as if the problem asks for the data structure to be built as part of the operations considered for the complexity requirement (we only get the array and the list of indices [L1,R1,L2,R2,...,Lt,Rt] and have to work with that) then the complexity requirement is not possible to achieve. If you want to build a data structure you will at least need to go over all the elements that need to be inserted into the data structure, and if you don't want to build a data structure and you intend to use the array as it is then you will have to go over all the elements between which you require to find the max. Either way in case you get t=1 and L1=1, R1=n where n is the number of elements in the array A you will have to go over all the elements in A (you have no assumptions regarding A's elements' order). That means you need at least O(n) and the requirement is O(t+klogk). To make it even clearer that these are totally unrelated quantities - you can consider the case for k=1, meaning finding only the highest element, while A can be of arbitrary length (let's say 10,000).

Option 3 - data structure build only with array exempt

Which brings me to my guess of the original meaning behind the question: build a data structure from the given array A beforehand (the building of which will not be included in the complexity requirement) so that when later on you are given sections' indices [L1,R1],[L2,R2],...[Lt,Rt] and an integer k the data structure you built before will allow you to get the k highest elements among the sections defined by the newly given indices (L1,R1,L2,R2,...,Lt,Rt) in the original array A in complexity O(t+klogk). Unfortunately I did not come up with a solution to this problem yet - or with a proof that such a solution does not exist, but at least now the problem makes sense.

I would love to know which of the options was the original intention of the question, and if I come up with a solution to option 3 I shall edit this answer accordingly.

Community
  • 1
  • 1
et_l
  • 1,868
  • 17
  • 29
0

I'd build an indirection table T, that maps [0..t] to the indices mapped by the sections.

Then I'd run the heapify algorithm using this indirection table. Every time the algorithm accesses an A[i], it would be replaced by a A[T[i]].

Both indirection table construction and heapify are O(t). Getting the top k elements would cost O(t log n).

EDIT: Wikipedia explains in its article how to build heap in O(t).

Basically it's a sequence of bubble-down operations:

for i from floor(A.size()/2) to 0:
    bubble_down(A, i)

There is a proof in the article why this is O(t).

Juan Lopes
  • 10,143
  • 2
  • 25
  • 44
  • first, thank you. second, i see here a solution for the problem of "top-k elements" by heap of heaps (very nice solution) that works in order O(klogk), the problem is how to build this heap in order O(t)? your solution is O(tlogn) and i afraif it's unsatisfiable, i'm attaching the link of the solution(see the answer of Victor Nicollet). [link](http://stackoverflow.com/questions/11209556/print-the-biggest-k-elements-in-a-given-heap-in-oklogk) – Avihai Danino Feb 21 '13 at 20:45
  • @JuanLopes you added the details and explanation for building a heap in O(n), where n is the number of elements in the built heap. That's not O(t). In the question above t is the number of sections. – et_l Sep 07 '16 at 22:29
  • @et_l If I heapify an array with t elements, that's O(t), isn't it? I may be missing something, I remember little from this answer. – Juan Lopes Sep 07 '16 at 22:57
  • @JuanLopes yes, of course it is. But there is no array with `t` elements in the question. There are `t` pairs of indices defining sections of arbitrary lengths in the original array `A` if I understand correctly. I put my thoughts in an answer here because it's too long and complicated for a comment, but I think the phrasing of the question is wrong. The short story: I think it is meant to ask us to build a structure beforehand using only the array `A` and then given the sections (`L1,R1,...Lt,Rt`) and an integer `k` get the `k` highest value elements in complexity `O(t+klogk)`. – et_l Sep 08 '16 at 13:26