I am interested in implementing the Rabin-Karp algorithm to search for sub strings as stated on wiki: http://en.wikipedia.org/wiki/Rabin-Karp_string_search_algorithm. Not for homework, but for self-interest. I have implemented the Rabin-Karp algorithm (shown below) and it works. However, the performance is really, really bad!!! I understand that my hash function is basic. However, it seems that a simple call to strstr() will always outperform my function rabin_karp(). I can understand why - the hash function is doing more work than a simple char-by-char compare each loop. What am I missing here? Should the Rabin-Karp algorithm be faster than a call to strstr()? When is the Rabin-Karp algorithm best used? Hence my self-interest. Have I even implemented the algorithm right?
size_t hash(char* str, size_t i)
{
size_t h = 0;
size_t magic_exp = 1;
// if (str != NULL)
{
while (i-- != 0)
{
magic_exp *= 101;
h += magic_exp + *str;
++str;
}
}
return h;
}
char* rabin_karp(char* s, char* find)
{
char* p = NULL;
if (s != NULL && find != NULL)
{
size_t n = strlen(s);
size_t m = strlen(find);
if (n > m)
{
size_t hfind = hash(find, m);
char* end = s + (n - m + 1);
for (char* i = s; i < end; ++i)
{
size_t hs = hash(i, m);
if (hs == hfind)
{
if (strncmp(i, find, m) == 0)
{
p = i;
break;
}
}
}
}
}
return p;
}