4

Continued from my last two questions, "Range Minimum Query approach (from tree to restricted RMQ)" and "Range Minimum Query approach (Last steps)"

I followed this tutorial on TopCoder, and the approach is introduced in the last section.

Now assuming I have everything done, and I am ready for query. According to the tutorial, this is what I should do:

i and j are in the same block, so we use the value computed in P and T

For example, if there's a block like this:

000111

The minimum value lies of course in the third 0, but if i and j are like 4 and 6, the third 0 won't lie in the queried criteria. Is my understanding wrong?

i and j are in different blocks, so we compute three values: the minimum from i to the end of i's block using P and T, the minimum of all blocks between i's and j's block using precomputed queries on A' and the minimum from the begining of j's block to j, again using T and P; finally return the position where the overall minimum is using the three values you just computed.

Why compute the minimum from i to end of i's block and the minimum of start of j's block to j? Don't the answer of both lies outside of i...j? Also, how to do that if it's not entirely a fit just like the last question.

Community
  • 1
  • 1
Shane Hsu
  • 7,937
  • 6
  • 39
  • 63

1 Answers1

3

The minimum value lies of course in the third 0, but if i and j are like 4 and 6, the third 0 won't lie in the queried criteria. Is my understanding wrong?

The idea is to precompute the RMQ for all pairs of indices in every possible block. As a result, regardless of what indices you query within that block, you should always be able, in O(1) time, to read off the RMQ of the two values within the block. In the case you listed in your question, the fact that indices 4 and 6 don't contain the block minimum is true but irrelevant. You'll already have the RMQ precomputed for indices 4 and 6.

Why compute the minimum from i to end of i's block and the minimum of start of j's block to j? Don't the answer of both lies outside of i...j? Also, how to do that if it's not entirely a fit just like the last question.

Consider this picture:

+------+------+------+------+------+------+
| ?i?? | ???? | ???? | ???? | ??j? | ???? |
+------+------+------+------+------+------+
   ^                            ^
   i                            j

If you want to solve RMQ(i, j), then the minimum could be in one of three places:

  • In the same block as i, at an index from the position of i within its block to the end of its block,
  • In the same block as j, at an index from 0 to the position of j within its block, or
  • Somewhere in one of the middle three blocks.

The algorithm works by using the precomputed tables to solve the problem in the first two cases, then using the other algorithm to solve it for the third case. The minimum of these three should be your answer.

Hope this helps! This is by no means an easy algorithm, so please feel free to ask more questions here if you need help!

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
  • so P and T actually have all those combinations and all the possible indices combination? How will that data structure be? Can you add it to the answer? – Shane Hsu Feb 09 '13 at 04:55
  • I see no reference about computing for each pair of indices in the tutorial, I think that'll be O(N^2). – Shane Hsu Feb 09 '13 at 04:55
  • @ShaneHsu- For the part about comparing all pairs of indices within a block: "So, for each binary block of size l we need to lock up in a table P the value for RMQ between every pair of indices. This can be trivially computed in O(sqrt(N)*l2)=O(N) time and space" – templatetypedef Feb 09 '13 at 04:56
  • @ShaneHsu- The block size is chosen as log(n / 2) so that there are only (log(n / 2))^2 pairs of indices per block, which is smaller than O(N). The number of blocks ends up being sqrt(N), so the total work done to compute all indices is therefore O(sqrt(N) log(N)^2), which is smaller than O(n). – templatetypedef Feb 09 '13 at 04:57
  • @ShaneHsu- As for the data structure that you would use - you could have a three-dimensional array, where the first index is the block index, the second index is the start position in the block, and the third index is the end position within the block. This is probably the easiest way to do this. – templatetypedef Feb 09 '13 at 04:58
  • Ok, thanks! So it's a lot of math to prove it in linear time! I think I can complete the code today! – Shane Hsu Feb 09 '13 at 04:59
  • @ShaneHsu- Yep - the choice of log(n / 2) is done specifically to make the math work out correctly. This is not a trivial algorithm to implement or analyze! – templatetypedef Feb 09 '13 at 05:00
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/24222/discussion-between-shane-hsu-and-templatetypedef) – Shane Hsu Feb 09 '13 at 05:20
  • After a few days, the code is generally completed, but still has a few bugs. I will post the code once it's all complete. But it seems by the numbers of line, it won't be much faster or even slower. – Shane Hsu Feb 12 '13 at 04:07
  • @templatetypedef- About the time complexity of calculating pairs of indices per block: If n=20, we have 4 blocks and each block has length 5. There are 14 pairs of incides per block, so the preprocessing takes 70 operations rather than 20. That can't be O(n), can it? – Atte Juvonen May 21 '16 at 17:52
  • @AtteJuvonen You're right that you have to be careful about how you choose the block size. If you make the blocks too small, you don't get much value from precomputing things, but if you make the blocks too big, you spend too much time precomputing. If you choose the block size to be b, then there are 2^b possible blocks and b^2 possible queries within each block. You need to choose a value of b such that 2^b * b^2 happens to be O(n). One way to do this is to pick, say, b = lg (sqrt n). – templatetypedef May 21 '16 at 17:59
  • Ah, I was calculating block size incorrectly. Thanks! – Atte Juvonen May 21 '16 at 18:04