1

I'm trying to implement a program similar to 20 Questions, in which a text file of the questions and guesses for answers are loaded, copied into a char array (where the new space lines are replaced by '/0' in order to split the questions into individual strings). The array works fine after the text file is copied into it. A tree structure is set up to organize the phrases into the yes/no question tree, where the left child is the yes response, while the right is the no response, and leaves are the guesses that the program uses to guess at the end.

The issue that I'm having is that after I build the tree (call treeBuilder from InitTree), the contents of the array where the phrases from the text file were copied became corrupted.

Before I call InitTree, the array contents look like this:

Is it furry? Does it meow? a cat a dog Does it have tusks? Does it have big ears? an elephant a rhinoceros an alligator

After calling it, it looks like this:

Is it furry? -???` ?p ?a dog Does it have tusks? Does it have big ears? an elephant a rhinoceros an alligator

I've been testing where it stops working, and within treeBuilder, all of the elements of the array are intact, but as soon as the function call to treeBuilder ends, the array becomes corrupted. I've tried protecting the memory by using calloc whenever I allocate memory, and even by making the character array static, which worked in a similar situation where this happened. But all of my preventative measures don't seem to be working and I'm not sure where the problem lies. I've already tried looking at similar cases here on stackoverflow but I couldn't anything that related to my issue.

This eventually leads to a seg fault, when the program actually starts to use the tree, for obvious reasons.

I've tried running gdb, but for whatever reason it won't let me go through line by line, because it cannot find the line information, and just skips everything until it either prompts for input, or gets a memory error or something, so running gdb isn't very helpful here. I'm guessing this might be because the main function is in an included file or something. But that's beside the point.

Here's the code related to the problem:

struct treeStruct {
    char *string;
    struct treeStruct *left, *right;
};

typedef struct treeStruct *TreeType;

// Builds a tree
void treeBuilder(TreeType tree, char **phrase, long level){
    // Gets the level (number of tabs) of the next phrase
    long nextLevel = countTabs(*phrase + strlen(*phrase) + 1);

    tree->string = *phrase + level; // Assigns the response pointer to the tree array

    // Move the pointer to the next string, since the the strings need to be
    // put into the tree in linear order
    (*phrase) += strlen(*phrase) + 1;

    if (level >= nextLevel){
    // Compares the current level with the level of the next string
            // to determine if returning up the tree is necessary;
            // This should be the answer to a question.
            tree->left = NULL;
            tree->right = NULL;
            return;
    }
    else{
            // Makes sure the left and right pointers of the struct have
            // allocated space
            tree->left = calloc(1, sizeof(TreeType));
            tree->right = calloc(1, sizeof(TreeType));

            // Adds the yes and no branches to the tree, recursion will take care
            // of adding sub-branches
            treeBuilder(tree->left, phrase, level + 1);
            treeBuilder(tree->right, phrase, level + 1);
    }

    return;

}


TreeType InitTree (char *file){
    if(file == NULL){
            printf("File '%s' does not exist.\n", file);
            exit(2);
    }

    FILE *fp;
    fp = fopen(file, "r");

    // Create a space in memory for the loaded questions to occupy
    static char *phrases;
    phrases = (char *)malloc(MAXSTR * MAXNUMQS * sizeof(char));

    copyText(fp, phrases);

    fclose(fp);

    // Create space in memory for the tree structure
    TreeType tree;
    tree = (TreeType) calloc(1, sizeof(TreeType));

    // Create a pointer to a pointer so that treeBuilder can
    // change what the first pointer is pointing to, so the strings in
    // phrases can be added in order throughout the recursion
    static char *phrase_ptr, **phrase_ptr2;
    phrase_ptr = &phrases[0];
    phrase_ptr2 = &phrase_ptr;

    //Build the tree
    treeBuilder(tree, phrase_ptr2, 0);

    topNode = tree;

    return tree;
}

Sorry if this is tl;dr, but I wanted to be as clear as possible on my issue.

jtcramer
  • 107
  • 3
  • 9
  • 1
    It would be great if you could provide what is called a http://sscce.org example. The minimal, but compilable code that can still reproduce your problem. This might also help you find the bug yourself. – ArjunShankar Apr 06 '12 at 08:20
  • e.g. here you open a file and read it with a function we don't even get to see. Instead maybe `strcpy` a string constant into `phrases`. – ArjunShankar Apr 06 '12 at 08:21
  • When you're recursing, do you really want `level + 1`? or the `nextLevel` that you previously calculated? – jpm Apr 06 '12 at 08:23

1 Answers1

3

Just one thing I noticed is that you're using sizeof(TreeType), but TreeType is a pointer to a struct and not the struct itself. This means that you are creating a pointer that is pointing to nowhere, and that dereferencing the pointer will lead to undefined behaviour. Which having just read the rest of the question would certainly explain the segfaults.

I think you would be better off not typedef-ing your struct as a pointer, and be more explicit with your use of pointers.

eg.

typedef struct treeStruct TreeType;

void treeBuilder(TreeType *tree, char **phrase, long level){
    ...
    if (!tree->left) {
        // calloc returns a pointer to a new bit of memory that has been 
        // assigned on the heap
        TreeType *temp = calloc(1, sizeof(TreeType));
        // assignments below not explicitly needed as you're using calloc
        temp->string = NULL;
        temp->left = NULL;
        temp->right = NULL;

        tree->left = temp;
    }
    ...
}

Here's a question about typedef-ing pointers. Seems to be relatively common in C, and used to imply that the data type is opaque and should not be dereferenced by the user (only by the API calls that the user passes it to).

Community
  • 1
  • 1
Dunes
  • 37,291
  • 7
  • 81
  • 97
  • I would normally do that, but the assignment I'm doing required that TreeType be a pointer. – jtcramer Apr 06 '12 at 08:29
  • @jtcramer: maybe the purpuse of the assignment is to teach you not to hide pointers behind typedefs? – wildplasser Apr 06 '12 at 08:32
  • Use `sizeof(struct treeType)` to easily assign the memory then. – Dunes Apr 06 '12 at 08:34
  • Actually, that seemed to be the source of the problem. I changed sizeof(TreeType) to sizeof(treeStruct) and it seemed to fix the problem. Thanks! – jtcramer Apr 06 '12 at 08:35