2

I wrote a c program, compiled it and it ran fine. After a few compiles - it started giving me a segmentation fault. I renamed the folder, recompiled and it worked again.
Is this something normal? To have an inconsistent segmentation fault? I change the output name, change folder names etc.. and it bounces from giving segmentation fault to not giving seg fault. I don't know what to do anymore.
I mean, if it is a coding problem, seg fault should be consistent, right? I should get it every time. here's the code:
file my_set.c:

#include <stdio.h>
#include <stdlib.h>
#include "list.h"

/*
The program acceps a set of numbers from stdin until EOF
And then prints them (not storing duplicate numbers)
*/

int main ()
{
    int num; 
    nodePtr head; /*head of the list*/

    while (scanf("%d", &num) != EOF)
    {
        addToList(num, &head);
    }
    printList(head);
    freeList(head);
    return 0;
}

file list.c:

#include <stdio.h>
#include <stdlib.h>
#include "list.h"

/*
Implements a linked list, each element of which contains a dynamic array.
I used a linked list to maximize potential memory in case it is fragmented.
I use a dynamic array in each node to minimize the percentage of overhead
from creating a list (the pointer, the index...);
*/

/*
Adds number n to list *h
4 cases:
1. list is empty:
    creating one
    updating h with new list
    creating a new dynamic array in the list
    updating it and the index
2. can reallocate current node's array for 1 more int
3. cannot reallocate current node's array:
    creating a new node
    initializing it
4. cannot create a new node
    printing the current list, an "out of memory error" and freeing all memory.
*/
void addToList(int n, nodePtr *h)
{
    static nodePtr p; /*points to current last node*/
    int *temp; /*for use in reallocation*/

    if (!*h) /*first item of set*/
    {
        *h = malloc (sizeof(node));
        (*h)->arr = malloc(sizeof(int));
        (*h)->arr[0] = n;
        (*h)->i = 1;
        p = *h;
        return;
    }

    /*if n is already in the list, no need to add it
    the call comes after first item, because first item cannot be in the list*/
    if(existsInList(n, *h)) return;

    /*using realloc while still possible*/
    if ((temp = realloc(p->arr, (p->i+1)*sizeof(int))))
    {
        p->arr = temp;
        p->arr[p->i] = n;
        p->i++;
        return;
    }

    /*if  realloc no longet possible - start new node*/
    if ((p->next = malloc(sizeof(node))))
    {
        p = p->next;
        p->arr = malloc(sizeof(int));
        p->arr[0] = n;
        p->i = 1;
        return;
    }

    /*can no longer start new nodes - quit with error, after printing*/
    printf("out of memory!");
    printList(*h);
    freeList(*h);
}

/*checks if n is in p assuming p is not null
it can asume so because the call for it comes after the check for first item*/
int existsInList(int n, nodePtr p)
{
    int i;
    for (; p ; p = p->next)
        for (i = 0; i < p->i; i++)
            if (p->arr[i] == n)
                return 1;
    return 0;
}

/*frees the list*/
void freeList(nodePtr p)
{
    nodePtr temp = p;

    if (!p) return; /*list is empty*/

    while (p)
    {
        free(p->arr);
        p = p->next;
        free(temp);
    }
}

/*prints the content of the list to stdout*/
void printList(nodePtr p)
{
    if (!p) return;
    int i;
    printf("\n");
    for (; p ; p = p->next)
        for (i = 0; i < p->i; i++)
            printf("%d ", p->arr[i]);   
    printf("\n");
}

file list.h:

/*
pointer to a node
declare a variable of this type to create a list
then start adding to the list
*/
typedef struct s *nodePtr;

/*the struct that represents each node of the list
reason for dynamic array is in "list.c"
*/
typedef struct s
{
    int *arr;
    int i; /*index for next num, also size of array;*/
    nodePtr next;
}node;

/*Adds the int to list at nodePtr omitting duplicates*/
void addToList(int, nodePtr*);
/*prints a list*/
void printList(nodePtr);
/*returns 1 if an int exists in list referenced by nodePtr, 0 otherwise*/
int existsInList(int, nodePtr);
/*frees all dynamically allocated memory*/
void freeList(nodePtr);

Basically all I do is get numbers from stdin, put them in a list(no duplicates) and then print them. I use a list of dynamic arrays.

Daniel Shaulov
  • 2,354
  • 2
  • 16
  • 25

3 Answers3

6

Initialize your variables!

int num = 0;  
nodePtr head = NULL; /*head of the list*/

ADD: The inconsistent behaviour can come from debug vs release compilation, usually compilers in debug mode set the non-initialized variables to weird values like 0xDDDDDDDD to make the problem immediately visible. In release mode if the memory block is zeroed it will happen that the content of the variables is 0 but there is no guarantee for it.

jdehaan
  • 19,700
  • 6
  • 57
  • 97
  • That seems to fix it. I did not know that pointers are uninitialized. But I don't think it's debug/release causing inconsistency, I used the same makefile every compile. (and no -g flag) – Daniel Shaulov May 26 '11 at 11:02
  • The -g flag tells to include or not debug information but you really make a debug build (no optimization) `-O0` (or shortly `-O`) should be used as a compile switch. Under these conditions the compiler also reports more warnings. Probably also the uninitialized problem in this case. You can have an optimized executable with debug information, the -g tells only to add the symbols or not, but not to have an unoptimized debug build. Also consider using `-Wall` to enable all warnings. – jdehaan May 26 '11 at 12:35
  • what should be used for release? -O3? – Daniel Shaulov May 26 '11 at 13:37
1

Intermittent segfaults in c/c++ programs are usually caused by uninitialised memory, often in pointer variables.

You've posted a lot of code, which makes it hard to debug just be reading it. I suggest going through the code and, wherever a variable is declared, giving it an initial value (e.g. zero or NULL). Remember that the compiler will not initialise them for you.

You should probably start by initialising the values of num and head in main(). E.g.

int num = 0; 
nodePtr head = NULL; /*head of the list*/

EDIT 1

Another bug is in addToList(). If the first if block in that function is not executed then the value of the local variable p will be uninitailised when you later call realloc(p->arr, ...). When you dereference p to get p->arr, ifp` is uninitialised then you will usually get a segfault.

EDIT 2

Two useful techniques when programming in C/C++ :

  1. Always initialise variables at the point that you declare them. If you don't then their value is undefined. Note that this doesn't solve all problems. If you dereference an uninitialised pointer then you will usually get a segfault. If you initailise it to null and then dereference it then you will always get a segfault. Easier to debug, but it still crashes.
  2. Always declare variables as near as possible to the point in the code that you first use them. This has the effect of reducing the chances of using an uninitialised variable because the compiler will generate an 'undeclared variable' error. The practice of declaring all variables at the start of a function is a hangover from old-style 'K&R' C, where you had to do that. Modern C doesn't require it.

So, instead of:

int foo()  // Warning: bad code
{
    int a;
    int b;

    func1();
    a=func2(&b);
    return a;
}

try something like:

int foo()
{
    func1();
    int b = 42;
    int a = func2(&b);
    return a;
}
Andy Johnson
  • 7,938
  • 4
  • 33
  • 50
  • The whole point of the first if block is to run the first time the method is called, thus initializing p. – Daniel Shaulov May 26 '11 at 10:57
  • @Danish94 I understand that. The first `if` block in `addToList()` is only executed if the value pointed to by `h` is null. Assuming you're now initialising `head` then then this will happen the first time `addToList()` is called. But on subsequent calls to `addToList()` the value of `h` will be non-null and `p` will not be initialised and you'll have problems when you use in in the call to `realloc()`. – Andy Johnson May 26 '11 at 11:12
  • @AndyJohnson But that is what the static is for, to preserve it across multiple calls for addToList, so on subsequent calls, p is still initialized. – Daniel Shaulov May 26 '11 at 11:15
  • @Danish94 You're right. I didn't notice that p is static. That'll work. You should still initialise p though: see my second edit. – Andy Johnson May 26 '11 at 11:36
  • @AndyJohnson Is that practice ok in c90 or is it only in c99? – Daniel Shaulov May 26 '11 at 12:05
  • @Danish94 The requirement to declare all variables at the start of a function was removed when the language was standardised as [ANSI C](http://en.wikipedia.org/wiki/ANSI_C) in 1989. ANSI C is C89, which is basically the same as C90. – Andy Johnson May 26 '11 at 12:11
  • @AndyJohnson the answer to [this](http://stackoverflow.com/questions/288441/variable-declaration-placement-in-c) question says otherwise. – Daniel Shaulov May 26 '11 at 13:25
1

You should check the returns values from malloc() in case it's returning NULL (out of memory).

Graham Borland
  • 60,055
  • 21
  • 138
  • 179
  • The only mallocs not checked are in the first if of the addToList function, and it should only run once and it only allocates one struct and one int. The realloc and malloc that run multiple times are checked. – Daniel Shaulov May 26 '11 at 11:00
  • @Danish94 there's another unchecked malloc in the last if() {} of addToList(). – Graham Borland May 26 '11 at 11:28