-2

I have to implement a function that calculates the hash h_N of a string according to a formula h_0 = 1337, h_n = h_n-1 * 5 + c_n, where c_1 and c_n are the strings characters for my homework.

I thought it would be a great idea to calculate it recursively but unfortunately I get a segmentation fault when trying to remove the last character of my given string.

uint64_t hashString(char *c)
{
    if(strlen(c) == 1) {
        return 1337 * 5 + (int) *c;
    }

    int lc = c[strlen(c) - 1];
    c[strlen(c) - 1] = 0;
    return hashString(c) * 5 + lc;
}

I even tried to use memmove but it did nothing. What am I doing wrong?

GSerg
  • 76,472
  • 17
  • 159
  • 346
  • Why are you calling `strlen` so often? It is expensive – Ed Heal Nov 01 '19 at 21:00
  • You are calling your function [on a string literal](https://stackoverflow.com/q/164194/11683)? – GSerg Nov 01 '19 at 21:00
  • I think you're testing using a string from a read-only memory area. Please share the code that you're using to test your function – Andrea Corbellini Nov 01 '19 at 21:02
  • my 2 cents.. using recursion can result in compact, elegant, pretty code. But practically it is dangerous due to stack usage runaway. Unless you **_know_** your function won't recurse an unsafe number of times, it's better to use a looping construct. Hashing certainly isn't a place for recursion, you can hash any amount of data. – yano Nov 01 '19 at 21:03
  • @AnttiHaapala I tried to use `memmove(c, c, strlen(c) - 2)` to remove the last char but it did nothing. –  Nov 01 '19 at 21:08

2 Answers2

3

I thought it would be a great idea to calculate it recursively

It isn’t! Unless your compiler can optimize it away (which involves tail call elimination as a step, and your implementation isn’t even tail recursive), each recursive call will use stack space, leading to a stack overflow with relatively small numbers of characters. Your current implementation also

  • calls strlen several times for every character, which is extremely slow (quadratic time – calculating the hash takes time proportional to the square of the length of the string) unless optimized away by a smart compiler

  • mutates its input, which is extremely unusual for a hash

Use a loop instead, and make sure your parameter is a char const*.

#include "testlib.h"
#include "hash.h"

int main() {
    test_start("hash.c");
    test_equals_int64(hashString("Abc"), result, "hash of A is correct");
    return test_end();
}

… but in this code, the segfault is specifically caused by the attempted modification of the read-only string literal "Abc". If your compiler didn’t warn you about that, look up how to enable more warnings.

Ry-
  • 218,210
  • 55
  • 464
  • 476
  • While all that is true, it doesn't really address the question which was about a segmentation fault. – GSerg Nov 01 '19 at 21:02
  • @GSerg: “leading to a stack overflow”. I guess it could be caused by something else too. – Ry- Nov 01 '19 at 21:02
0

I ran your code and it runs fine. But then I am able to crash it too. i wrote this as main

int main(int argc, char *argv[] ) {
   uint64_t ret = hashString( argv[1] );
   return 0;
}

If argv[1] is not there, it crashes, otherwise it works fine. So i suspect you have not allocated the memory for char *c. or if it is a string of 0 length.

Nish
  • 379
  • 2
  • 9