0

I have written the below shared code - for an exercise - which compiles without warnings and errors but, when I try to run it, it doesn't seem to work and it outputs the Process returned -1073741819 (0xC0000005) error before crashing. I have tried to search for any explaination for this behavior although I did not succed in doing so; I am still learning so any and all advice is welcome. Also, the values in the users.csv file can be invented but they should be of the form: <identification number>;<movies watched>;<age>;<occupation>

/*
Write a program that takes in input a file with data relative to users from
a streaming site, loads them in memory in a structured type variable and, 
after having sorted them by age in ascending order, writes in a new file
the data relative to only the users that are students.
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

const int MAX = 100;                // I wasn't able to use a preprocessor command (such as '#define MAX 100') | error: expected identifier or '(' before numeric constant | so I declared a constant global variable

typedef struct {
    int id, age, watched_movies;
    char occupation[30];
} USER[100];                        // can't initialize with MAX ( ==> cannot initialize arrays with variable sized elements)

//Written 'MAX' as 'max' to avoid variable shadowing and confusion in function building
void sort_users(USER* U, int max);
void print_students(USER* U, int max);

int main(void)
{
    FILE* usersPtr;

    if ((usersPtr = fopen("users.csv", "r")) == NULL) {
        puts("File 'users.csv' could not be opened.");
        exit(1);
    } else {
        USER U[100] = {{{0, 0, 0, ""}}};    // added braces to remove compiler warning
        for (int i = 0; i < MAX; ++i) {
            U[i]->id = 99999;
            U[i]->age = 0;
            U[i]->watched_movies = 0;
            strcpy(U[i]->occupation, "");
        }

        int i = 0;
        char string[60] = {""};
        char* strPtr;
        while (!feof(usersPtr)) {
            fscanf(usersPtr, "%s", string);

            strPtr = strtok(string, ";");
            U[i]->id = atoi(strPtr);

            strPtr = strtok(NULL, ";");
            U[i]->age = atoi(strPtr);

            strPtr = strtok(NULL, ";");
            U[i]->watched_movies = atoi(strPtr);

            strPtr = strtok(NULL, ";");
            strcpy(U[i]->occupation, strPtr);

            i++;
        }
        strPtr = NULL;

        sort_users(U, MAX);
        print_students(U, MAX);

        printf("%s", "The 'students.csv' file has been created and filled with the relevant pieces of information.\n");

        fclose(usersPtr);
        usersPtr = NULL;
    }
    return 0;
}

void sort_users(USER* U, int max)       // ("bubble sorting")
{
    for (int i = 0; i < max; ++i) {
        for (int j = max - 1; j != i; --j) {

            if (U[j]->age <= U[i]->age) {
                USER T[1] = {{{0, 0, 0, ""}}};                  // create a temporary array of structs (of 1 element)

                T[0]->id = U[i]->id;
                T[0]->age = U[i]->age;
                T[0]->watched_movies = U[i]->watched_movies;
                strcpy(T[0]->occupation, U[i]->occupation);

                T[i]->id = U[j]->id;
                T[i]->age = U[j]->age;
                T[i]->watched_movies = U[j]->watched_movies;
                strcpy(T[i]->occupation, U[j]->occupation);

                T[j]->id = U[0]->id;
                T[j]->age = U[0]->age;
                T[j]->watched_movies = U[0]->watched_movies;
                strcpy(T[j]->occupation, U[0]->occupation);
            }
        }
    }
}

void print_students(USER* U, int max)
{
    FILE* studentsPtr;

    if ((studentsPtr = fopen("students.csv", "w")) == NULL) {
        puts("File 'students.csv' could not be created.");
        exit(1);
    } else {
        for (int i = 0; i < max; ++i) {
            if (strcmp(U[i]->occupation, "student") == 0) {
                fprintf(studentsPtr, "%d,%d,%d,%s\n", U[i]->id, U[i]->age, U[i]->watched_movies, U[i]->occupation);
            }
        }
        fclose(studentsPtr);
        studentsPtr = NULL;
    }
}

users.csv example:

1234;18;10;student
3253;22;3;sommelier
1;16;100;teacher
1743;50;5;retired
9654;28;18;student
Gisvaldo
  • 21
  • 4
  • 4
    0xC0000005, happens when the program accesses memory it does not own. Use a debugger. – Hans Passant Apr 24 '23 at 18:02
  • It looks like you `T` temporary array is too small to accommodate all possible values that `i` and `j` can take in the loop. – 500 - Internal Server Error Apr 24 '23 at 18:06
  • 1
    Have you tried running your code line-by-line in a debugger while monitoring the control flow and the values of all variables, in order to determine in which line your program stops behaving as intended? If you did not try this, then you may want to read this: [What is a debugger and how can it help me diagnose problems?](https://stackoverflow.com/q/25385173/12149471) You may also want to read this: [How to debug small programs?](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/) – Andreas Wenzel Apr 24 '23 at 18:07
  • Most likely, you're on *Win*. Use *MAX\_USERS* instead. Are you sure about `typedef struct {...} USER[100];` and `USER U[100] = ...`? – CristiFati Apr 24 '23 at 18:10
  • 1
    Questions seeking debugging help should generally provide a [mre] of the problem, which includes the exact input required to reproduce the problem. This allows other people to easily test your program, by simply using copy&paste. – Andreas Wenzel Apr 24 '23 at 18:13
  • 1
    Please edit your _question_ and post a sample `users.csv` file in a separate code block here [so we may download it along with your program if we wish]. The data should be enough to cause the fault. – Craig Estey Apr 24 '23 at 18:25

1 Answers1

2

A few issues ...

  1. You have a typedef struct { ... } USER[100];.
  2. You want: typedef struct { ... } USER; because, later, you do: USER U[100]; You can't [well, _shouldn't] have both.
  3. I think you've got some confusion about struct scalars (single variables) vs. struct arrays and tried to make everything an array of some sort.
  4. Your loop to read in the data uses feof. This has issues. It didn't stop after the last line. This would cause a segfault (the error you got) because the buffer only had one token [from the unchanged buffer from the previous loop iteration] and the second strtok call would return NULL.
  5. You're using fscanf to read in the buffer. Better to use fgets to get the line and strcspn to strip the newline.
  6. The fscanf would fail if (e.g.) the occupation had a space in it such as: Chief Executive Officer
  7. You are passing MAX as the count to sort_users and print_students. This should be i--the actual number of users read in.
  8. In your sort function, you were doing: T[j]->id = U[0]->id. T only has one element so T[j] is undefined behavior (and causes a second segfault). (i.e.) The swap code is broken.
  9. In the sort, You're iterating j backwards. This might be a valid way to do a sort, but I think it's problematic.
  10. You most certainly can use #define MAX 100

Here is corrected version.

  1. It is annotated with the bugs and fixes.
  2. After changing the typedef [as mentioned above], I changed all (e.g.) U[i]->id into U->id because I made U a pointer to the current element.
  3. I added some helper macros to iterate through the arrays (e.g. FORUSERS and FORALLUSERS).
/*
Write a program that takes in input a file with data relative to users from
a streaming site, loads them in memory in a structured type variable and,
after having sorted them by age in ascending order, writes in a new file
the data relative to only the users that are students.
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#if 0
// NOTE/BUG: whatever error you got, it was _not_ from the #define
const int MAX = 100;                    // I wasn't able to use a preprocessor command (such as '#define MAX 100') | error: expected identifier or '(' before numeric constant | so I declared a constant global variable
#else
#define MAX     100
#endif

typedef struct {
    int id,
     age,
     watched_movies;
    char occupation[30];
#if 0
// NOTE/BUG: we do _not_ the typedef to define an _array_ of structs -- only a
// single instance
} USER[100];                            // can't initialize with MAX ( ==> cannot initialize arrays with variable sized elements)
#else
} USER;
#endif

// helper macros to loop through arrays of USER structs
#define FORUSERS(u_,lo_,hi_) \
    USER *u_ = &ulist[lo_];  u_ < &ulist[hi_];  ++u_
#define FORALLUSERS(u_) \
    FORUSERS(u_,0,max)

//Written 'MAX' as 'max' to avoid variable shadowing and confusion in function building
void sort_users(USER * U, int max);
void print_students(USER * U, int max);

int
main(void)
{
    FILE *usersPtr;
    int max = MAX;

    if ((usersPtr = fopen("users.csv", "r")) == NULL) {
        puts("File 'users.csv' could not be opened.");
        exit(1);
    }
    else {
#if 0
// NOTE/BUG: because of the original typedef, this defines 10,000 elements!
        USER userlist[100] = { {{0, 0, 0, ""}
            }
// NOTE/BUG: the compiler warnings are your _friend_
        }; // added braces to remove compiler warning
#else
        USER ulist[MAX];
#endif
        for (FORALLUSERS(U)) {
            U->id = 99999;
            U->age = 0;
            U->watched_movies = 0;
            strcpy(U->occupation, "");
        }

#if 0
        int max = 0;
        char string[60] = { "" };
#else
        char string[1000];
#endif
        char *strPtr;

#if 0
// NOTE/BUG: this does _not_ stop after the last line and causes a segfault
// on the strtok for id
        while (!feof(usersPtr)) {
            fscanf(usersPtr, "%s", string);
#else
        max = 0;
        while (fgets(string,sizeof(string),usersPtr) != NULL) {
            string[strcspn(string,"\n")] = 0;
#endif

            USER *U = &ulist[max];

            strPtr = strtok(string, ";");
            U->id = atoi(strPtr);

            strPtr = strtok(NULL, ";");
            U->age = atoi(strPtr);

            strPtr = strtok(NULL, ";");
            U->watched_movies = atoi(strPtr);

            strPtr = strtok(NULL, ";");
            strcpy(U->occupation, strPtr);

            max++;
        }
        strPtr = NULL;

#if 0
// NOTE/BUG: we must only pass the actual number of users read in
        sort_users(U, MAX);
        print_students(U, MAX);
#else
        sort_users(ulist, max);
        print_students(ulist, max);
#endif

        printf("%s", "The 'students.csv' file has been created and filled with the relevant pieces of information.\n");

        fclose(usersPtr);
        usersPtr = NULL;
    }

    return 0;
}

void
sort_users(USER *ulist, int max)            // ("bubble sorting")
{
    for (FORALLUSERS(Ui)) {
#if 0
// NOTE/BUG: while this may be valid, it is confusing [to me ;-)]
        for (int j = max - 1; j != i; --j) {
#else
        for (FORUSERS(Uj,(Ui - ulist) + 1,max)) {
#endif

#if 0
// NOTE/BUG: since I've changed the for loops, this test is reversed
            if (Uj->age <= Ui->age) {
                USER T[1] = { {{0, 0, 0, ""}
                    }
                };                      // create a temporary array of structs (of 1 element)
#else
            if (Uj->age > Ui->age) {
                USER T;
#endif

#if 0
// NOTE/BUG: this is a bit convoluted
                T[0]->id = Ui->id;
                T[0]->age = Ui->age;
                T[0]->watched_movies = Ui->watched_movies;
                strcpy(T[0]->occupation, Ui->occupation);

                T[i]->id = Uj->id;
                T[i]->age = Uj->age;
                T[i]->watched_movies = Uj->watched_movies;
                strcpy(T[i]->occupation, Uj->occupation);

// NOTE/BUG: T[j] goes beyond the end of T
                T[j]->id = U[0]->id;
                T[j]->age = U[0]->age;
                T[j]->watched_movies = U[0]->watched_movies;
                strcpy(T[j]->occupation, U[0]->occupation);
#else
                T = *Ui;
                *Ui = *Uj;
                *Uj = T;
#endif
            }
        }
    }
}

void
print_students(USER *ulist, int max)
{
    FILE *studentsPtr;

    if ((studentsPtr = fopen("students.csv", "w")) == NULL) {
        puts("File 'students.csv' could not be created.");
        exit(1);
    }
    else {
        for (FORALLUSERS(U)) {
            if (strcasecmp(U->occupation, "student") == 0) {
                fprintf(studentsPtr, "%d,%d,%d,%s\n",
                    U->id, U->age, U->watched_movies, U->occupation);
            }
        }
        fclose(studentsPtr);
        studentsPtr = NULL;
    }
}

In the code above, I've used cpp conditionals to denote old vs. new code:

#if 0
// old code
#else
// new code
#endif

#if 1
// new code
#endif

Note: this can be cleaned up by running the file through unifdef -k


Here is the users.csv I created:

11111;23;5;Programmer
22222;37;6;Doctor
33333;43;7;Lawyer
44444;17;82;Student
55555;12;197;Student

Here is the students.csv that the program created:

55555,12,197,Student
44444,17,82,Student

Here is the cleaned up code (run through unifdef -k):

/*
Write a program that takes in input a file with data relative to users from
a streaming site, loads them in memory in a structured type variable and,
after having sorted them by age in ascending order, writes in a new file
the data relative to only the users that are students.
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX     100

typedef struct {
    int id,
     age,
     watched_movies;
    char occupation[30];
} USER;

// helper macros to loop through arrays of USER structs
#define FORUSERS(u_,lo_,hi_) \
    USER *u_ = &ulist[lo_];  u_ < &ulist[hi_];  ++u_
#define FORALLUSERS(u_) \
    FORUSERS(u_,0,max)

//Written 'MAX' as 'max' to avoid variable shadowing and confusion in function building
void sort_users(USER * U, int max);
void print_students(USER * U, int max);

int
main(void)
{
    FILE *usersPtr;
    int max = MAX;

    if ((usersPtr = fopen("users.csv", "r")) == NULL) {
        puts("File 'users.csv' could not be opened.");
        exit(1);
    }
    else {
        USER ulist[MAX];
        for (FORALLUSERS(U)) {
            U->id = 99999;
            U->age = 0;
            U->watched_movies = 0;
            strcpy(U->occupation, "");
        }

        char string[1000];
        char *strPtr;

        max = 0;
        while (fgets(string,sizeof(string),usersPtr) != NULL) {
            string[strcspn(string,"\n")] = 0;

            USER *U = &ulist[max];

            strPtr = strtok(string, ";");
            U->id = atoi(strPtr);

            strPtr = strtok(NULL, ";");
            U->age = atoi(strPtr);

            strPtr = strtok(NULL, ";");
            U->watched_movies = atoi(strPtr);

            strPtr = strtok(NULL, ";");
            strcpy(U->occupation, strPtr);

            max++;
        }
        strPtr = NULL;

        sort_users(ulist, max);
        print_students(ulist, max);

        printf("%s", "The 'students.csv' file has been created and filled with the relevant pieces of information.\n");

        fclose(usersPtr);
        usersPtr = NULL;
    }

    return 0;
}

void
sort_users(USER *ulist, int max)            // ("bubble sorting")
{
    for (FORALLUSERS(Ui)) {
        for (FORUSERS(Uj,(Ui - ulist) + 1,max)) {

            if (Uj->age > Ui->age) {
                USER T;

                T = *Ui;
                *Ui = *Uj;
                *Uj = T;
            }
        }
    }
}

void
print_students(USER *ulist, int max)
{
    FILE *studentsPtr;

    if ((studentsPtr = fopen("students.csv", "w")) == NULL) {
        puts("File 'students.csv' could not be created.");
        exit(1);
    }
    else {
        for (FORALLUSERS(U)) {
            if (strcasecmp(U->occupation, "student") == 0) {
                fprintf(studentsPtr, "%d,%d,%d,%s\n",
                    U->id, U->age, U->watched_movies, U->occupation);
            }
        }
        fclose(studentsPtr);
        studentsPtr = NULL;
    }
}

UPDATE:

Thank you! Would you mind telling me more about these macros? They are not a concept I am familiar with, yet. – Gisvaldo

Although there are many references on macros with arguments, here is one: https://www.math.utah.edu/docs/info/cpp_toc.html#SEC12

Unlike macros in [some] assemblers, we can't get a clean .lst file from C that shows the post-processed output. But, we can give cc and/or cpp the -E and -P options to see the output.

But, for meaningful compares, we need to do a fair amount of cleanup on the input and output files so that a diff has some meaning.

Here is the diff that shows the expansion of the FORUSERS and FORALLUSERS macros. As a bonus(?), we also see the effect of the MAX macro:

1d0
< #define MAX       100
8,11d6
< #define FORUSERS(u_,lo_,hi_) \
<   USER *u_ = &ulist[lo_];  u_ < &ulist[hi_];  ++u_
< #define FORALLUSERS(u_) \
<   FORUSERS(u_,0,max)
18c13
<   int max = MAX;
---
>   int max = 100;
24,25c19,20
<       USER ulist[MAX];
<       for (FORALLUSERS(U)) {
---
>       USER ulist[100];
>       for (USER *U = &ulist[0]; U < &ulist[max]; ++U) {
59,60c54,55
<   for (FORALLUSERS(Ui)) {
<       for (FORUSERS(Uj, (Ui - ulist) + 1, max)) {
---
>   for (USER *Ui = &ulist[0]; Ui < &ulist[max]; ++Ui) {
>       for (USER *Uj = &ulist[(Ui - ulist) + 1]; Uj < &ulist[max]; ++Uj) {
79c74
<       for (FORALLUSERS(U)) {
---
>       for (USER *U = &ulist[0]; U < &ulist[max]; ++U) {

To either further your education [or confusion ;-)], below is an enhanced example:

  1. In the above, the name of the list was hardwired into the macros as ulist. In the code below, it is now an argument in the macro.
  2. We were using and passing around two variables: ulist and max
  3. The maximum count of the user list array was fixed at MAX (e.g. 100)
  4. If we create a "list" struct (e.g. ULIST below), we can implement the list as a dynamic array that can grow to any size. And, we can eliminate MAX altogether.
  5. Now, we only need pass a single argument [the list pointer] to the functions.
  6. As a byproduct of using such a dynamic list, it eliminates many of your initial problems.
// Write a program that takes in input a file with data relative to users from
// a streaming site, loads them in memory in a structured type variable and,
// after having sorted them by age in ascending order, writes in a new file
// the data relative to only the users that are students.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// definition of user
typedef struct {
    int id,
     age,
     watched_movies;
    char occupation[30];
} USER;

// dynamic list of users
typedef struct {
    USER *data;                         // pointer to users in list
    int max;                            // number of users in list
} ULIST;

// helper macros to loop through arrays of USER structs
#define FORUSERS(u_,ul_,lo_,hi_) \
    USER *u_ = &ul_->data[lo_];  u_ < &ul_->data[hi_];  ++u_
#define FORALLUSERS(u_,ul_) \
    FORUSERS(u_,ul_,0,ul_->max)

void sort_users(ULIST *ulist);
void print_students(ULIST *ulist);
USER *ulistnew(ULIST *ulist);

int
main(void)
{
    FILE *usersPtr;

    if ((usersPtr = fopen("users.csv", "r")) == NULL) {
        puts("File 'users.csv' could not be opened.");
        exit(1);
    }

    // get an empty list
    ULIST *ulist = calloc(1,sizeof(*ulist));

    char string[1000];
    char *strPtr;

    while (fgets(string,sizeof(string),usersPtr) != NULL) {
        string[strcspn(string,"\n")] = 0;

        USER *U = ulistnew(ulist);

        strPtr = strtok(string, ";");
        U->id = atoi(strPtr);

        strPtr = strtok(NULL, ";");
        U->age = atoi(strPtr);

        strPtr = strtok(NULL, ";");
        U->watched_movies = atoi(strPtr);

        strPtr = strtok(NULL, ";");
        strcpy(U->occupation, strPtr);
    }
    strPtr = NULL;

    sort_users(ulist);
    print_students(ulist);

    printf("The 'students.csv' file has been created and filled"
    " with the relevant pieces of information.\n");

    fclose(usersPtr);
    usersPtr = NULL;

    return 0;
}

void
sort_users(ULIST *ulist)
{
    for (FORALLUSERS(Ui,ulist)) {
        size_t uidx = Ui - ulist->data;
        for (FORUSERS(Uj,ulist,uidx + 1,ulist->max)) {
            if (Uj->age > Ui->age) {
                USER T = *Ui;
                *Ui = *Uj;
                *Uj = T;
            }
        }
    }
}

void
print_students(ULIST *ulist)
{
    FILE *studentsPtr;

    if ((studentsPtr = fopen("students.csv", "w")) == NULL) {
        puts("File 'students.csv' could not be created.");
        exit(1);
    }

    for (FORALLUSERS(U,ulist)) {
        if (strcasecmp(U->occupation, "student") == 0) {
            fprintf(studentsPtr, "%d,%d,%d,%s\n",
                U->id, U->age, U->watched_movies, U->occupation);
        }
    }

    fclose(studentsPtr);
    studentsPtr = NULL;
}

USER *
ulistnew(ULIST *ulist)
{
    // get index of new element and increment the count of users
    int uidx = ulist->max++;

    // increase the size of the array of users
    ulist->data = realloc(ulist->data,sizeof(*ulist->data) * ulist->max);

    // fault out on no memory
    if (ulist->data == NULL) {
        perror("realloc");
        exit(1);
    }

    // point to the new entry
    USER *U = &ulist->data[uidx];

    return U;
}
Craig Estey
  • 30,627
  • 4
  • 24
  • 48
  • Thank you! Would you mind telling me more about these macros? They are not a concept I am familiar with, yet. – Gisvaldo Apr 25 '23 at 07:09
  • 1
    @Gisvaldo I've added an update that may help you with the macros. I also created an enhanced version of the program that is more of what I would have done. _Be careful what you wish for--you may actually get it ..._ :-) – Craig Estey Apr 25 '23 at 21:17
  • You have been most helpful. One question though: wouldn't the condition `((strcasecmp(U->occupation, "student") == 0)` - in the function `print_students()` - also be met if one's occupation simply started with the letter `s` since, per my understanding, the `strcasecmp` returns the integer with the location of the first matching case between the 2 arrays? – Gisvaldo Apr 26 '23 at 18:13
  • 1
    @Gisvaldo Oops, yes, you are correct. I got ahead of myself. I had been thinking about: `if (strcasestr(U->occupation,"student") != NULL)` This [new version] would match fields such as: `graduate student`, `student of the arts`, or `I am a student` whereas the `strcasecmp` would _only_ match `Student` But, `strcasecmp` is a _comparison_, so `strcasecmp(stra,strb)` returns `-1-->stra < strb, 0-->stra == strb, +1-->stra > strb`. (i.e. it's suitable as a function passed to `qsort` amongst other things). – Craig Estey Apr 26 '23 at 18:43