0

Okay, I put together a simplified example of the problem code:

#include "stdio.h"
#include "string.h"

struct Trie{
    //Holds sub-tries for letters a-z
    struct Trie *sub[26];
    //Is this a substring, or a complete word?
    int is_word;
};
typedef struct Trie Trie;

Trie dictionary;

int main(int argc, char *argv[]){
    //A list of words
    char *words[7] = {"the","of","and","to","a","in","that"};

    //Add the words to the Trie structure
    int i=0, wordlen;
    Trie *sub_dict;
    for (;i<7; i++){
        //Reset
        printf("NEW WORD\n");
        sub_dict = &dictionary;
        //Add a word to the dictionary
        int j=0, c;
        while (c = words[i][j], c != '\0'){
            printf("char = %c\n",c);
            //Initialize the sub-Trie
            if (sub_dict->sub[c-97] == NULL)
                sub_dict->sub[c-97] = (Trie*) malloc(sizeof(Trie*));
            //Set as new sub-trie
            sub_dict = sub_dict->sub[c-97];
            j++;
        }
        sub_dict->is_word = 1;
    }
}

Basically, I have a Trie data structure that holds the letters "a" through "z". I have a list of words that should get added inside the while loop. Unfortunately, I get a segmentation fault at different points in the loop (depending on when I run it).

I'm guessing the problem has something to do with the line
sub_dict->sub[c-97] = (Trie*) malloc(sizeof(Trie*));
but I'm new to C, so I have absolutely no clue what's going on.

Azmisov
  • 6,493
  • 7
  • 53
  • 70

2 Answers2

2

sub_dict->sub[c-97] = (Trie*) malloc(sizeof(Trie*)); there is an error.

sizeof(Trie*) will be 4 in 32bit os, because Trie* is a pointer, and the size of a pointer in 32bit os is 4. you can do like this: sub_dict->sub[c-97] = (Trie*) malloc(sizeof(Trie));

sumous
  • 365
  • 4
  • 12
  • Why does it segfault if I do `Trie temp; sub_dict->sub[c-97] = &temp;` instead? Shouldn't that do the same thing as using `malloc`? – Azmisov Oct 24 '12 at 03:17
  • 1
    Because you go writing beyond the end of the memory that was allocated, so you trampled over `malloc()`'s control information, so it got confused about which memory it should give you next. – Jonathan Leffler Oct 24 '12 at 03:18
1

You seem to presume that when you do

something = (Trie*) malloc(sizeof(Trie*));

then the contents of that structure is initialized to zeroes (e.g. every member will start as NULL). This is not the case with malloc(). You have to either use calloc, or use memset() to reset it after allocation.

In fact, I'd call memset even on your starting dictionary to be on the safe side. (Even though global and static variables are apparently initialized to zero, so this might not be necessary for your case.)

Community
  • 1
  • 1
che
  • 12,097
  • 7
  • 42
  • 71
  • So, will the structure just be filled with a bunch of garbage then? Is it only by luck that `sub_dict->sub[c-97] == NULL` gives the intended result? – Azmisov Oct 24 '12 at 03:18
  • 1
    Yes; use `calloc()` to get zeroed memory. The data returned by `malloc()` may contain any sort of garbage; it is at most accidentally zeroed. – Jonathan Leffler Oct 24 '12 at 03:19