0

When I compile and run my code it sort of looks like the code works, but when I run Valgrind with help50, it says "==1295== Conditional jump or move depends on uninitialized value(s)" I don't get how to fix this problem. Help50 tells me to focus on line 54 of my code, but I don't understand what is wrong, it was working before, now it is a mistake. What I don't understand is what the mistake is/

#include <ctype.h>
#include <stdio.h>
#include <stdbool.h>
#include <string.h>
#include <stdlib.h>
#include "dictionary.h"

// Represents a node in a hash table
typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;
// Number of buckets in hash table
const unsigned int N = 10000;
int i = 0;

// Hash table
node *table[N];

// Returns true if word is in dictionary else false
bool check(const char *word)
{
    char lword[LENGTH+1];
    for(i = 0; i < strlen(word); i++)
    {
        lword[i] = word[i];
        lword[i] = tolower(lword[i]);
    }
    node *current;
    int hashnum = hash(lword);
    if(table[hashnum] == NULL)
    return false;
    current = table[hashnum];
    while(current->next != NULL)
    {
        if(strcmp(current->word, word) == 0)
        return true;
        else
        current = current->next;
    }
    return false;
}

// Hashes word to a number
// Hash function from cs50.stackexchange
unsigned int hash(const char *word)
{
    int n;
    unsigned int hash_value = 0;
    for (i = 0, n = strlen(word); i < n; i++)
    {
         hash_value = (hash_value << 2) ^ word[i];
    }
    return hash_value % N; //N is size of hashtable
}
// Loads dictionary into memory, returning true if successful else false
// adopted from github user
int word_count = 0;
bool load(const char *dictionary)
{
    FILE *file = fopen(dictionary, "r");
    if (file == NULL)
    {
        return false;
    }
    char word[LENGTH + 1];
    while (fscanf(file, "%s", word) != EOF)
    {
        node *new_node = malloc(sizeof(node));
        if (new_node == NULL)
        {
            free(new_node);
            return false;
        }
        strcpy(new_node->word, word);
        int h = hash(new_node->word);
        node *head = table[h];
        if (head == NULL)
        {
            table[h] = new_node;
            word_count++;
        }
        else
        {
            new_node->next = table[h];
            table[h] = new_node;
            word_count++;
        }
    }
    fclose(file);
    return true;
}


// Returns number of words in dictionary if loaded else 0 if not yet loaded
unsigned int size(void)
{
    return word_count;
}

// Unloads dictionary from memory, returning true if successful else false
bool unload(void)
{
    for(i=0;i<10000;i++)
    {
        for(i = 0; i < 10000; i++)
    {
        while(table[i] != NULL)
        {
            char *retval = NULL;
            if (table[i]->next == NULL)
            {
                retval = table[i]->word;
                free(table[i]);
                return retval;
            }
            else
            {
                node * current = table[i];
                while (current->next->next != NULL)
                {
                    current = current->next;
                }
                retval = current->next->word;
                free(current->next);
                current->next = NULL;
            }
        }
    }
    }
return true;
}
  • In `load` you are not setting the `new_node->next` field for the `if (head == NULL)` case. Should be set to `NULL`. – kaylum Oct 14 '20 at 23:02
  • 1
    Seperate problem: `while(current->next != NULL)` looks wrong as it will skip the last element in the chain. Should be `while(current != NULL)` – kaylum Oct 14 '20 at 23:04
  • 1
    Another one: [Why is “while ( !feof (file) )” always wrong?](https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong) – kaylum Oct 14 '20 at 23:05
  • OT: for ease of readability and understanding: Please consistently indent the code. (don't use tabs) – user3629249 Oct 15 '20 at 00:16
  • regarding: `for(i = 0; i < strlen(word); i++)` The function: `strlen()` returns a `size_t` ( a unsigned long int ) but the variable `i` is an `int` This is comparing a signed value to a unsigned value. This will 'usually' work, but much better if `i` were declared as `size_t` – user3629249 Oct 15 '20 at 00:40
  • regarding: `const unsigned int N = 10000;` and `node *table[N];` in C, this is trying to initialize an array from a 'non-constant' value in 'file space'. Suggest changing the declaration of `N` to: `#define N 10000` – user3629249 Oct 15 '20 at 00:44
  • regarding: `int hashnum = hash(lword);` This is calling the `hash()` function before the function has been defined. The result is the compiler will use 'default' type `int` for the parameter. Suggest either placing a prototype for `hash()` right after the `#include` statements or moving the declaration of the `hash()` function before the first call to that function. – user3629249 Oct 15 '20 at 00:47
  • regarding: `lword[i] = tolower(lword[i]);` The function: `tolower()` has the prototype: `int tolower(int c);` so this is performing an implicit conversion from `int` to `char` This 'usually' will work, but much better to specifically cast the value like: `lword[i] = (char)tolower( lword[i] );` – user3629249 Oct 15 '20 at 00:51
  • regarding: `if (new_node == NULL) { free(new_node); return false; }` since the contents of `new_node` is NULL, that means no memory was allocated. Although `free()` properly handles the event where the parameter contains `NULL`, there is no need to clutter the code with the call to `free()` – user3629249 Oct 15 '20 at 00:55
  • regarding: `#include "dictionary.h"` and `char word[LENGTH + 1];` The header file: `dictionary.h` is 'home grown' so need to post a few critical lines from that file. The macro `LENGTH` is not defined anywhere in the posted code, – user3629249 Oct 15 '20 at 00:57
  • when compiling, always enable the warnings, then fix those warnings. ( for `gcc`, at a minimum use: `-Wall -Wextra -Wconversion -pedantic -std=gnu11` ) Note: other compilers use different options to produce the same results – user3629249 Oct 15 '20 at 01:00
  • regarding: `if (file == NULL) { return false; }` should always inform the user when an error occurs. Suggest: `if (file == NULL) { perror( "fopen for input file failed" ); return false; } – user3629249 Oct 15 '20 at 01:04
  • 1
    You may find [CS50 Speller Segmentation Fault Issue During Misspelled Words](https://stackoverflow.com/a/63681299/3422102) helpful – David C. Rankin Oct 15 '20 at 01:13
  • the function: `hash()` (and other places) is using the original capitalization of the 'word[]'. Therefore `The` and `the` will not match, but they should match. This is a major logic flaw in the posted code that needs to be corrected – user3629249 Oct 15 '20 at 01:14
  • regarding: `while (fscanf(file, "%s", word) != EOF)` 1) much better to check for the 'positive' value. I.E. 1 rather than EOF. 2) the input conversion specifier: `%s` allows for an unlimited number of characters to be input. This 'could' overflow the input buffer. Suggest using a MAX CHARACTERS modifier that is 1 less than the length of the input buffer to avoid any chance of a buffer overflow and the resulting undefined behavior. Note -1 because the `%s` always appends a NUL byte to the input – user3629249 Oct 15 '20 at 01:26
  • which is line 54? please indicate that line in the posted code – user3629249 Oct 15 '20 at 01:27
  • In functions `check` and `hash` you are using a global variable `i` as your loop counter. It is not a good style to use global variables unless needed. A loop counter is clearly no such situation where a global variable might be justified. – Gerhardh Oct 15 '20 at 07:13

1 Answers1

0

You have a large number of problems. The primary problem leading to the valgrind conditional move based on uninitialized value is your failure to initialize the ->next pointer NULL in load(), e.g.:

        node *new_node = malloc(sizeof(node));
        if (new_node == NULL)
        {
//             free(new_node);                  /* not necessary */
            return false;
        }
        strcpy(new_node->word, word);
        new_node->next = NULL;                  /* must initialize NULL */

That will resolve the valgrind issues.

C is not C++. const unsigned int N = 10000; does not create an integer constant resulting in node *table[N]; being a VLA (Variable Length Array) of pointers to node rather than an array of pointers to node. This is a problem resulting in the error:

error: variably modified ‘table’ at file scope

(it is unclear how you were able to compile with gcc with the VLA. See C11 Standard - 6.7.6.2 Array declarators(p2))

Instead, you need to #define the value for N, use a global enum or move the declaration of table to block or function scope. (note, N should be at least ten-times as large to keep the number of hash collisions -- and the hash table Load Factor [buckes_used/total_buckets] below 0.7)

Type Matters

Your hash function is declared as:

unsigned int hash(const char *word)

It returns type unsigned int, you should not be assigning the result to int. While the modulo will keep the value returned in the range of positive integer values, that is playing with fire.

//     int hashnum = hash(lword);
    unsigned int hashnum = hash(lword);

In other circumstances, if bit-31 is 1, your assignment to int will result in the value being negative -- and if used as an array index, results in Undefined Behavior.

The unload() function is much much more convoluted that necessary. You have a global array table used as your hash table. The only thing that needs to be freed are any nodes in buckets that are not empty. So you simply need to loop over each bucket and check if it is empty. If not, then traverse the list that starts with the bucket element freeing each node, e.g.:

bool unload(void)
{
//     for(size_t i=0; i<10000;i++)       /* you have a constant -- use it */
    for(size_t i=0; i<N;i++)
    {
        node *n = table[i];
        while (n) {
            node *victim = n;
            n = n->next;
            free (victim);
        }
    }
    
    word_count = 0;
    
    return true;
}

Unnecessary Code

There are several places where you code isn't wrong, but involves additional copying or unnecessary calls to free(), etc.. For example in check(), there is no need to assign to lword[i] just to assign the value again after converting to lowercase. Just convert to lowercase and assign:

    for (size_t i = 0; i <= strlen(word); i++)
        lword[i] = tolower(word[i]);
//     {
//         lword[i] = word[i];
//         lword[i] = tolower(lword[i]);
//     }

In load(), if the allocation of new_node fails, there isn't any need to free(new_node);, as shown earlier. Why? When malloc() fails, it returns NULL. There is nothing allocated. Though free(NULL); is harmless (free() will check for you), it's simply unnecessary.

Resulting Memory Use

If you make all of the changes, your code will now run without memory error and will free the memory it allocated, e.g.

$ valgrind ./bin/speller texts/lalaland.txt > test/lalaland.txt
==28984== Memcheck, a memory error detector
==28984== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==28984== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==28984== Command: ./bin/speller texts/lalaland.txt
==28984==
==28984==
==28984== HEAP SUMMARY:
==28984==     in use at exit: 0 bytes in 0 blocks
==28984==   total heap usage: 143,095 allocs, 143,095 frees, 8,022,392 bytes allocated
==28984==
==28984== All heap blocks were freed -- no leaks are possible
==28984==
==28984== For counts of detected and suppressed errors, rerun with: -v
==28984== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I haven't check correctness of all answers, but for lalaland.txt the answer appears correct.

Look things over and let me know if you have further questions.

David C. Rankin
  • 81,885
  • 6
  • 58
  • 85