3

I have a string anna, where values of chars in the string are a = 1, n = 14 ( You can compute the value of other chars like ( char - 96 ) and hash function which looks like :

int hashCode( string s ) // s = "anna";
{
    k = 0;
    for ( int i = 0; i < s.length(); i++ )
      k = ( 7 * k + 3 * s[i] ) % 61;
    return k;
}

How do I find a string of length 3 where collision happens ( something smart )? The only way that comes to my mind is to calculate k of anna which is 29 and then somehow think of another string of length 3 which gives 29.

samgak
  • 23,944
  • 4
  • 60
  • 82
kvway
  • 494
  • 4
  • 16
  • 2
    you'll have to bruteforce it I'm afraid, yes. – Jean-François Fabre Nov 30 '16 at 18:17
  • 1
    I see that _61_ is prime and you are modulating by it. Welcome to the world of [the Discrete Logarithm Problem](https://en.wikipedia.org/wiki/Discrete_logarithm). – Spencer D Nov 30 '16 at 18:29
  • @Jean-FrançoisFabre: I wouldn't be so sure, there is a reason writing a good one-way cryptographic hash function is hard. That said, I'm no crypto or math expert, so I can't say off-hand how you'd reverse this properly, but when you're talking about length three strings of lower case letters, you don't need to be smart; the search space is only `26 ** 3 == 17,576` total strings, so exhausting them all is trivial. – ShadowRanger Nov 30 '16 at 18:31
  • @ShadowRanger I agree. But it's still bruteforce. It's just that you don't need that much force :) let's take it up to 8 letters it will be different. – Jean-François Fabre Nov 30 '16 at 18:32
  • @SpencerDoak: Of course, for a six bit prime, the discrete log problem is not particularly hard. Storing the complete table of discrete logs for all bases and all exponents for a prime of 61 would be what, 3600 entries at most? And again, my math is bad, some of the exponents might not require entries, so it might be a little smaller than my simple 60 * 60 calculation. – ShadowRanger Nov 30 '16 at 18:36
  • @Jean-FrançoisFabre: I suppose. When I hear "brute force" I think "naive solution trying every input until you get a hit". A solution that shaves off a meaningful number of bits of work is not "brute force" anymore, even if it's still expensive relative to the cost of computing a single hash. Of course, when you've only got 15 bits of work, you don't need to spend a lot of time looking to shave more of them off. :-) – ShadowRanger Nov 30 '16 at 18:43
  • 1
    @ShadowRanger, oh yeah, it's the same basic scenario as Jean-FrançoisFabre's comment: for this given scenario, finding a collision via bruteforce is not a difficult thing nor does it require a lot of space. However, this is an example of the discrete logarithm problem, and perhaps the OP (and maybe future readers) could benefit from learning a bit about the discrete logarithm problem and how it makes one-way functions difficult to reverse. Obviously, for this given scenario, bruteforcing or even generating a rainbow table wouldn't be too difficult or expensive (processing/storage-wise). – Spencer D Nov 30 '16 at 19:19
  • @Jean-FrançoisFabre even with 8 letters, if you stop searching as soon as you get a collision it won't take long. There are only 61 possible hash values, so if you just randomly guess strings you'll get one after 30 or so guesses on average. – samgak Nov 30 '16 at 20:44
  • @samgak it makes sense. So even with 40 letters it would be easy. Fun! – Jean-François Fabre Nov 30 '16 at 20:46

1 Answers1

4

Why don't you simply generate all strings of length 3 and compute their hashes (in fact, you can stop when all 61 possible values of hash functions can were reached)? The number of options is not too large.

One possible optimization: if you need to answer multiple queries (like given a string, find a string of length 3 with the same value of the hash function), you can generate all strings of length 3 and compute their hashes once to build a map hash -> a string of length 3 with this hash. Once you have this map, finding a string of length 3 with the same hash as the given one is just one map lookup.

kraskevich
  • 18,368
  • 4
  • 33
  • 45