This question is very similar to rolling-hash, but there are some specifics regarding overflow/negative result which are still not clear to me.
I have as well checked out this Rabin-Karp implementation and have issues with the line bellow:
txtHash = (txtHash + Q - RM*txt.charAt(i-M) % Q) % Q;
I understand that the following expression might give negative result:
txtHash - RM*txt.charAt(i-M)
First question:
- if we always add Q, a large prime, can this result with negative number due to overflow ?
- If not, why not ? If yes, shouldn't this addition be done only if result is negative ?
Second question:
If, for a moment, we didn't care about negative numbers, would it be correct to write expression bellow ?
txtHash = (txtHash - RM*txt.charAt(i-M)) % Q;
Third question, this part confuses me most:
Lets assume that the overflow cannot happen when we add Q. Why is there left-most % Q operation over the leading digit ?
txtHash = (txtHash + Q - RM*txt.charAt(i-M) % Q ) % Q;
I have read the answer i have linked and according to the answer by Aneesh, and if i understood correctly expressions bellow should be similar:
hash = hash - ((5 % p)*(10^2 %p) %p)
txtHash = (txtHash + Q - RM*txt.charAt(i-M) % Q) % Q;
But i don't see why they are similar because in example with hash, % p is not calculated for previous hash value, however for txtHash we calculate % Q over the previous hash as well.