1

My code is having segmentation fault (core dumped) here. I'm not pretty sure which line is causing it since I'm pretty new to C. What I am trying to do here is to implement a binary search tree for each line on the file (only for insertion and search). Each line will not be more than 1000 words.

Here's my code:

BST *bst_ins(BST *bst, char *key, void *value)
{
  BST* temp = NULL;
  int cmp = strcmp(bst->kvp.key, key);
  if(bst == NULL)
  { /* This for null node */
    temp = (BST*)malloc(sizeof(BST)); /* allocate the memory of struct for new node */
    temp->left = NULL;
    temp->right = NULL;
    temp->kvp.key = key;
    temp->kvp.value = value;
  }


  else if(cmp < 0) /* Current node less than input */
  {
    bst->right = bst_ins(bst->right,key,value);
  }

  else if(cmp > 0)
  {
    bst->left = bst_ins(bst->left,key,value);
  }

  return bst;
}


KVP *bst_get(BST *bst, char *key)
{
    KVP* return_this;
    if(bst!=NULL)
    {
        if(bst->kvp.key==key)
        {
            return return_this;
        }
        else if(strcmp(bst->kvp.key, key) < 0) /* Current node less than input */
        {
            return bst_get(bst->left, key);
        }
        else if(strcmp(bst->kvp.key,key) > 0)
        {
            return bst_get(bst->right,key);
        }
    }
    return 0;
}

Below is header.h

typedef struct {
    char *key;
    void *value;
} KVP;
typedef struct bst {
    struct bst *left;
    struct bst *right;
    KVP kvp;
} BST;

Could someone please help me figure out which line is causing it? Thanks.

EDIT:

SOLVED! FINALLY :D

Here's the main function.

int main()
{
char* str1=strdup("alokama");
char* str2=strdup("kokokoko");
BST *bst = NULL;
bst = bst_insert(bst,str1,NULL);
bst = bst_insert(bst,str2,NULL);
if(bst_get(bst,str1)){ printf ("yuhuu\n"); }
return 0;
}
sinsuren
  • 1,745
  • 2
  • 23
  • 26
  • 2
    `I'm not pretty sure which line is causing` --> use `gdb` --> u'll be sure. :-) – Sourav Ghosh Mar 24 '15 at 12:19
  • 1
    Standard Warning : Please [do not cast](http://stackoverflow.com/q/605845/2173917) the return value of `malloc()` and family in `C`. – Sourav Ghosh Mar 24 '15 at 12:20
  • 1
    When you test strings for equality, you should use `strcmp` (as in your less/greater-than comparisons), not `==`, which will test for identity. You can also save many calls to `strcmp` by finding the result of the comparison once and storing it. – M Oehm Mar 24 '15 at 12:22
  • 1
    At some point your head `BST *` will probably be `NULL`. In that case, `if(bst->kvp.key == NULL) ...` in `bst_ins` will dereference a null pointer. You probably mean `if (bst == NULL) ...`. – M Oehm Mar 24 '15 at 12:25
  • Best avoid `void *value;`. If you want different data types, it's better to use a union, and, why use a pointer and not the value of the required type? Also `temp->kvp.key = key;` stores the string pointer. You don't show the calling code, but this would have to point to memory that has been allocated, containing a copied string, otherwise there is a good chance that the pointer will later on no longer point to what it did at the time you made the tree insertion (when you process the next line). Better, is to allocate memory for the key in `bst_ins()` and copy the string. – Weather Vane Mar 24 '15 at 12:44
  • I don't really believe union would be a good idea. If there are 1000 potential data types, the union would have 1000 members. And if your data type is not in the list of 1000 supported data types, then you lose. – juhist Mar 24 '15 at 12:57
  • Your code doesn't have a main function. Please consider posting a complete code example with the main function, which crashes. – juhist Mar 24 '15 at 12:58

1 Answers1

0

When you call bst_ins, bst may be NULL, so you can't dereference it. Also, you define a temporary node, which is NULL. When you get to your string comparisons,

if (strcmp(temp->kvp.key, key) < 0) ...

you certainly dereference a NULL pointer. Not good. The code below fixes the issue and also calls strcmp only once per pass. Note how temp is defined only in the scope where a new node is created.

BST *bst_ins(BST * bst, char *key, void *value)
{
    int cmp;

    if (bst == NULL) {
        BST *temp = (BST *) malloc(sizeof(*temp));

        temp->left = NULL;
        temp->right = NULL;
        temp->kvp.key = key;
        temp->kvp.value = value;

        return temp;
    }

    cmp = strcmp(bst->kvp.key, key);

    if (cmp < 0) {
        bst->right = bst_ins(bst->right, key, value);
    } else if (cmp > 0) {
        bst->left = bst_ins(bst->left, key, value);
    }

    return bst;
}

You can now insert new nodes like so:

bst = bst_ins(bst, "plum", "1");
bst = bst_ins(bst, "apple", "2");
bst = bst_ins(bst, "orange", "3");
bst = bst_ins(bst, "kumquat", "4");

But that is a bit clumsy, because you have to assign the result to the root node, which is redundant and easy to forget. You also lose the useful possibility to return the (possibly new) node associated with the key.

A better approach might be to pass the address of the root node to functions that may change the tree. If you are not afraid of the (*bst) notation and the address-of poerator &, here goes:

BST *bst_ins(BST **bst, char *key, void *value)
{
    int cmp;

    if (*bst == NULL) {
        *bst = (BST *) malloc(sizeof(**bst));

        (*bst)->left = NULL;
        (*bst)->right = NULL;
        (*bst)->kvp.key = key;
        (*bst)->kvp.value = value;

        return *bst;
    }

    cmp = strcmp((*bst)->kvp.key, key);

    if (cmp < 0)  return bst_ins(&(*bst)->right, key, value);    
    if (cmp > 0)  return bst_ins(&(*bst)->left, key, value);

    return *bst;
}

and call it like so:

bst_ins(&bst, "plum", "1");
bst_ins(&bst, "apple", "2");
bst_ins(&bst, "orange", "3");
bst_ins(&bst, "kumquat", "4");

The function returns the node associated with the key, but you can easily ignore the return value.

Finally, make sure that your tree is properly deleted when you are done.

M Oehm
  • 28,726
  • 3
  • 31
  • 42
  • I see now. I was confused with the use of pointer and how to dereferences it. Thank you so much! – Ultimate Tc Mar 24 '15 at 22:02
  • I wonder though, does return *bst is returning the updated bst? or the new node? – Ultimate Tc Mar 24 '15 at 22:37
  • In the first piece of code, which resemles your approach, it returns the root node. That return value should not be ignoreed. In the second, pointer-to-pointer approach, it returns the updated node, which may safely be ignored, but which might be useful, for example: `BST* x = bst_ins(&bst, "x", NULL); /* do stuff */ x->kvp.value = p;`. (Well, maybe not _that_ useful, but you could, for example, add a node anyway as a kind of add-or-find and then increase a counter in `kvp`.) – M Oehm Mar 25 '15 at 06:11