-3

Hello i am doing a project where i have to implement a hashTable that stores words based on a hash function. On a stress test i get malloc(): memory corruption

The initial declaration of the hashTable

hashTable = (char**)malloc(hashSize[0] * sizeof(char*));

This is the function i wrote to add word to hashTable of hashSize:

    void addWord(char** hashTable, unsigned int hashSize, const char* word) {

    int bucketIndex = hash(word, hashSize);
    //printf("Word to add = %s, bucket = %d, hashTable size = %d\n", word, bucketIndex, hashSize);
    if(hashTable[bucketIndex] == NULL) {
        hashTable[bucketIndex] = (char*)malloc(strlen(word) * sizeof(char));
        strcpy(hashTable[bucketIndex], word);
        return;
    }
    /* checks for duplicats */
    int exists = 0;    
    char* heyStack = (char*)malloc(strlen(hashTable[bucketIndex]));
    memcpy(heyStack, hashTable[bucketIndex], strlen(hashTable[bucketIndex]));
    char* token = strtok(heyStack, " ");
    while(token) {
        if(strcmp(token, word) == 0) {
            exists = 1;
            break;
        }
        token = strtok(NULL, " ");
    }
    /* end check for duplicates */
    if(exists == 0) {
        size_t bucketSize = strlen(hashTable[bucketIndex]);
        hashTable[bucketIndex] = (char*)realloc(hashTable[bucketIndex], bucketSize + strlen(word) + 2);
        memcpy(hashTable[bucketIndex] + bucketSize, " ", 1);
        memcpy(hashTable[bucketIndex] + bucketSize + 1, word, strlen(word) + 1);

    }
}

I have a stress test that adds 20k words to the table and it always breaks on the same word (no 10k something)

Any ideas on what i am doing wrong ?

Tyvm

user1840302
  • 189
  • 3
  • 13
  • 1
    `malloc(strlen(word) * sizeof(char)` followed by `strcpy`. You must allocate one more byte, for the `nul` terminator. – Weather Vane Mar 14 '16 at 11:36
  • added the extra byte for the null terminator. Problem persists – user1840302 Mar 14 '16 at 11:38
  • I am not sure what you are doing with `heyStack`. You allocate memory based on `strlen` (no extra byte) but then you `memcpy`. Does that leave a string terminator behind? – Weather Vane Mar 14 '16 at 11:40
  • I see you use both memcpy() and strlen() on the same items. This is scary. Do you use null-terminated strings (in that case stlen() is ok, but using memcpy is a nonsense) or buffers who can contain any byte sequence (in that case you cannot use strlen() and you have to memorize buffers lengths in another buffer)? – jdarthenay Mar 14 '16 at 12:14
  • Note: They say [you shouldn't cast the result of `malloc()` in C](http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc). – MikeCAT Mar 14 '16 at 13:25

1 Answers1

0

You have to terminate the "strings" before passing it to functions that deal with strings, such as strlen() and strtok().

  • Allocate size of the string and one more byte for terminating null-character.
  • Terminate the "strings" by adding null character.

Notes:

Corrected code:

void addWord(char** hashTable, unsigned int hashSize, const char* word) {

    int bucketIndex = hash(word, hashSize);
    //printf("Word to add = %s, bucket = %d, hashTable size = %d\n", word, bucketIndex, hashSize);
    if(hashTable[bucketIndex] == NULL) {
        size_t wordSize = strlen(word);
        hashTable[bucketIndex] = malloc(wordSize + 1); /* size +1 */
        memcpy(hashTable[bucketIndex], word, wordSize + 1); /* why did you use strcpy() only in here? */
        return;
    }
    /* checks for duplicats */
    int exists = 0;
    size_t dataSize = strlen(hashTable[bucketIndex]);
    char* heyStack = malloc(dataSize + 1); /* size +1 */
    memcpy(heyStack, hashTable[bucketIndex], dataSize + 1); /* size +1 */
    char* token = strtok(heyStack, " ");
    while(token) {
        if(strcmp(token, word) == 0) {
            exists = 1;
            break;
        }
        token = strtok(NULL, " ");
    }
    /* end check for duplicates */
    if(exists == 0) {
        size_t bucketSize = strlen(hashTable[bucketIndex]);
        size_t wordSize = strlen(word);
        hashTable[bucketIndex] = realloc(hashTable[bucketIndex], bucketSize + wordSize + 2);
        memcpy(hashTable[bucketIndex] + bucketSize, " ", 1);
        memcpy(hashTable[bucketIndex] + bucketSize + 1, word, wordSize + 1);
    }
    free(heyStack); /* do free what you allocated */
}

This code will be better if you add some code to check if malloc() and realloc() are successful.

Community
  • 1
  • 1
MikeCAT
  • 73,922
  • 11
  • 45
  • 70