2

This is my very first post on stackoverflow. I am a CS student learning C, and I am having some issues with the problem I'm working on. Additionally, I should mention that I know very little, so if anything I put here comes off as foolish or ignorant, it is absolutely not my intention

I am aware that there are other posts similar to this one, however so far I feel that I have tried making a lot of amendments that all end with the same result.

I am given a text file in which each line contains studentName(tab)gpa. The total size of the file is unknown, this I must use dynamic memory allocation.

Example of text file format

Jordan  4.0
Bhupesh 2.51

General steps for program

Many details will be left out to save myself from embarrassment, however I will give a high-level overview of the process I am struggling with:

 1.) Create dynamic memory array to hold struct for each line
 2.) Start looping through file
 3.) check the current size of the array to see if reallocation is necessary
 4.) Create dynamic array to hold name
 5.) Place name and gpa into struct
 6.) rinse & repeat

Finally, one last thing. The error occurs when my initial allocated memory limit is reached and the program attempts to reallocate more memory from the heap.

Screenshot of error being thrown in clion debugger

My code is shown below:

#define EXIT_CODE_FAIL 1
#define ROW_COUNT 10
#define BUFFER_SIZE 255
#define VALID_ARG_COUNT 2

struct Student {
    float gpa;
    char * name;
};

// read the file, pack contents into struct array
struct Student * readFileContents(char *filename, int *rowCounter) {

    // setup for loop
    int maxDataSize = ROW_COUNT;
    float currentStudentGpa = 0;
    char studentNameBuffer[BUFFER_SIZE];

    // initial structArray pre-loop
    struct Student * structArray = calloc(maxDataSize, sizeof(*structArray));

    FILE *pFile = fopen(filename, "r");
    validateOpenFile(pFile);


    // loop through, get contents, of eaach line, place them in struct
    while (fscanf(pFile, "%s\t%f", studentNameBuffer, &currentStudentGpa) > 0) {
        structArray = checkArraySizeIncrease(*rowCounter, &maxDataSize, &structArray);
        structArray->name = trimStringFromBuffer(studentNameBuffer);
        structArray->gpa = currentStudentGpa;
        (*rowCounter)++, structArray++;
    }

    fclose(pFile);

    return structArray;
}

// resize array if needed
struct Student * checkArraySizeIncrease(int rowCount, int * maxDataSize, struct Student ** structArray) {

    if (rowCount == *maxDataSize) {
        *maxDataSize += ROW_COUNT;
        
        **// line below is where the error occurs** 
        struct Student * newStructArray = realloc(*structArray, *maxDataSize * sizeof(*newStructArray));
        validateMalloc(newStructArray);

        return newStructArray;
    }
    return *structArray;
}

// resize string from initial data buffer
char *trimStringFromBuffer(char *dataBuffer) {

    char *string = (char *) calloc(strlen(dataBuffer), sizeof(char));
    validateMalloc(string);
    strcpy(string, dataBuffer);

    return string;
}


Once again, I apologize if similar questions have been asked, but please know I have tried most of the recommendations that I have found on stack overflow with no success (of which I'm well aware is the result of my poor programming skill level in C).

I will now promptly prepare myself for my obligatory "first post on stackoverflow" roasting. Cheers!

  • Don't use realloc. (realloc isn't guaranteed to work in every case.) Use another malloc instead. – paladin Mar 11 '21 at 22:01
  • In `checkArraySizeIncrease`, you're _not_ setting `*structArray` if you increase the array with `realloc`. Change: `return newStructArray;` into `*structArray = newStructArray;` – Craig Estey Mar 11 '21 at 22:02
  • 2
    @paladin Umm, `realloc` works fine in most cases. If it fails, `malloc` probably will too. Using another `malloc` would leak memory. – Craig Estey Mar 11 '21 at 22:03
  • Instead of defining your own EXIT_CODE_FAIL, use EXIT_FAILURE and EXIT_SUCCESS defined in – William Pursell Mar 11 '21 at 22:06
  • As to the "roasting", I believe you avoided it by being "sufficiently" obsequious :-) And, posting a short/complete question ... But, "groveling" never hurts ... :-) – Craig Estey Mar 11 '21 at 22:07
  • @CraigEstey that's not right. realloc has a much higher chance to fail than malloc. And calling several mallocs won't leak memory if you check for null pointer. Btw JordanGunn92 can't you use an OS call to determine the file size and allocate enough memory in the first place? – paladin Mar 11 '21 at 22:08
  • 2
    If you increment `structArray`, you cannot `realloc` it. You need to call `realloc` on the memory location that was originally alloced. Keep a reference to the beginning, and increment a different pointer. – William Pursell Mar 11 '21 at 22:11
  • 1
    allocation in `calloc(strlen(dataBuffer), sizeof(char));` is missing 0-terminator, you risk buffer overflows there – tstanisl Mar 11 '21 at 22:11
  • @paladin _realloc has a much higher chance to fail than malloc_ ... Why do you think that? – Craig Estey Mar 11 '21 at 22:14
  • @CraigEstey malloc can only allocate a continues block of memory. If the CPU allocates another block of memory right behind your allocated block of memory, there is not much space left for enlarging your block of memory with realloc. So it's better to allocate a new block of memory with malloc and to keep track of all pointers. The best scenario is to allocate enough memory in the first place. – paladin Mar 11 '21 at 22:28
  • @CraigEstey PS if you really need to use several mallocs, while not knowing the amount of memory you'll need, it's a good practice to double each malloc memory allocation. So for example, you allocate 2mb in the first call, shouldn't that be enough, you allocate 4mb, then 8mb, 16mb and etc.. – paladin Mar 11 '21 at 22:34
  • 1
    @paladin You really should stop. What you think is a panacea is what realloc already does. I get the impression that you believe that if `realloc` is unable to enlarge the current area that it will fail. That is not true. It will simply do a malloc, copy, free so there is no reason to not use `realloc` for dynamic arrays as being used here. – Craig Estey Mar 11 '21 at 22:42
  • @CraigEstey Oh, I didn't knew that realloc is also able to copy. I just never used it. But to be honest, copying data from one memory location to another is just stupid. – paladin Mar 11 '21 at 22:57
  • 1
    Welcome to Stack Overflow. You get a vote for a well-formatted and well-asked question (no slacking off in the future...) – David C. Rankin Mar 12 '21 at 01:08
  • @WilliamPursell thanks for pointing this out! I've been using this macro for a while and had no idea it already existed. Also thank you everyone SO much for weighing in the topic, I really appreciate everyone's help and patience! – JordanGunn92 Mar 12 '21 at 07:57

3 Answers3

2

You are reusing structArray as both the base of the array and a pointer to the current element. This won't work. We need two variables.

There are a number of "loose" variables related to the dynamic array. It's cleaner to define a struct (e.g. dynarr_t below) to contain them and pass just the struct pointer around.

When you're duplicating the string, you must allocate strlen + 1 [not just strlen]. But, the entire function does what strdup already does.

I tried to save as much as possible, but I've had to refactor the code a fair bit to incorporate all the necessary changes.

By passing sizeof(*structArray) to the arrnew function, this allows the struct to be used for arbitrary size array elements.

Anyway, here's the code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

#define sysfault(_fmt...) \
    do { \
        printf(_fmt); \
        exit(1); \
    } while (0)

#define EXIT_CODE_FAIL 1
#define ROW_COUNT 10
#define BUFFER_SIZE 255
#define VALID_ARG_COUNT 2

struct Student {
    float gpa;
    char *name;
};

// general dynamic array control
typedef struct {
    void *base;                         // base address
    size_t size;                        // bytes in array element
    size_t count;                       // current number of used entries
    size_t max;                         // maximum number of entries
    size_t grow;                        // number of entries to grow
} dynarr_t;

// arrfind -- return pointer to array element
void *
arrfind(dynarr_t *arr,size_t idx)
{
    void *ptr;

    ptr = arr->base;
    idx *= arr->size;
    ptr += idx;

    return ptr;
}

// arrnew -- create new array control
dynarr_t *
arrnew(size_t siz,size_t grow)
// siz -- sizeof of array element
// grow -- number of elements to grow
{
    dynarr_t *arr;

    arr = calloc(1,sizeof(*arr));
    if (arr == NULL)
        sysfault("arrnew: calloc fail -- %s\n",strerror(errno));

    arr->size = siz;
    arr->grow = grow;

    return arr;
}

// arrgrow -- grow array [if necessary]
// RETURNS: pointer to element to fill
void *
arrgrow(dynarr_t *arr)
{
    void *ptr;

    // grow array if necessary
    // NOTE: use of a separate "max" from "count" reduces the number of realloc
    // calls
    if (arr->count >= arr->max) {
        arr->max += arr->grow;
        arr->base = realloc(arr->base,arr->size * arr->max);
        if (arr->base == NULL)
            sysfault("arrgrow: realloc failure -- %s\n",strerror(errno));
    }

    // point to current element
    ptr = arrfind(arr,arr->count);

    // advance count of elements
    ++arr->count;

    return ptr;
}

// arrtrim -- trim array to actual number of elements used
void
arrtrim(dynarr_t *arr)
{

    arr->base = realloc(arr->base,arr->size * arr->count);
    if (arr->base == NULL)
        sysfault("arrtrim: realloc failure -- %s\n",strerror(errno));

    arr->max = arr->count;
}

void
validateMalloc(void *ptr)
{

    if (ptr == NULL) {
        perror("validateMalloc");
        exit(1);
    }
}

void
validateOpenFile(FILE *ptr)
{

    if (ptr == NULL) {
        perror("validateOpenFile");
        exit(1);
    }
}

// resize string from initial data buffer
char *
trimStringFromBuffer(char *dataBuffer)
{

#if 0
#if 0
    char *string = calloc(1,strlen(dataBuffer));
#else
    char *string = calloc(1,strlen(dataBuffer) + 1);
#endif
    validateMalloc(string);
    strcpy(string, dataBuffer);
#else
    char *string = strdup(dataBuffer);
    validateMalloc(string);
#endif

    return string;
}

// read the file, pack contents into struct array
dynarr_t *
readFileContents(char *filename)
{
    dynarr_t *arr;

    // setup for loop
    float currentStudentGpa = 0;
    char studentNameBuffer[BUFFER_SIZE];
    struct Student *structArray;

    arr = arrnew(sizeof(*structArray),10);

    FILE *pFile = fopen(filename, "r");
    validateOpenFile(pFile);

    // loop through, get contents, of eaach line, place them in struct
    while (fscanf(pFile, "%s\t%f", studentNameBuffer, &currentStudentGpa) > 0) {
        structArray = arrgrow(arr);
        structArray->name = trimStringFromBuffer(studentNameBuffer);
        structArray->gpa = currentStudentGpa;
    }

    fclose(pFile);

    arrtrim(arr);

    return arr;
}
Craig Estey
  • 30,627
  • 4
  • 24
  • 48
  • I always feel a bit uncomfortable using the `do { ... } while;` macro wrap unless I'm sure it will be accepted on the OPs compiler. I'm unfamiliar with whether clion supports it or not, so it may be 100% okay here. – David C. Rankin Mar 12 '21 at 01:10
  • @DavidC.Rankin It's 100% straight C--no magic. It can be controversial only because some people use `do { ... } while (i < 10);` and _expect_ it to _loop_. In all the places I've worked, only _one_ colleague at _one_ company had a psychological problem with it. Here, on SO, it's understood. In fact, suggested by others. See [recent]: https://stackoverflow.com/questions/66573666/macro-to-do-fgets-stripping-newline/66574652#66574652 At the top, others were suggesting `do { } while (0);` – Craig Estey Mar 12 '21 at 01:22
  • @DavidC.Rankin See my other answers on this: https://stackoverflow.com/questions/60704120/isnt-the-time-complexity-of-my-code-onm and https://stackoverflow.com/questions/50656651/about-the-exclusiveness-of-the-cases-of-an-if-block/50685506#50685506 – Craig Estey Mar 12 '21 at 01:23
  • I don't know why, or which compiler it was, but it would not take a `do { ... } while (0)` wrapped macro. I am thinking it was VS10, but I may be off there. You got my vote regardless `:)` – David C. Rankin Mar 12 '21 at 06:09
  • @CraigEstey Thank you so so so much for taking the time to do this. This was so unbelievably helpful. I spent an excessive amount of time understanding this. What an amazing way to utilize structs. Once gain I appreciate the help so much, you are a hero! – JordanGunn92 Mar 12 '21 at 07:52
  • @CraigEstey Also, thank you for pointing out that strdup does what I was doing already in trimStringFromBuffer() , was very useful and helped me cut back on the amount of code. – JordanGunn92 Mar 12 '21 at 08:00
0

I think your issue is with the calculation of the size of the realloc. Rather than using sizeof(*newStructArray), shouldn't you really be using the size of your pointer type? I would have written this as realloc(*structArray, *maxDataSize * sizeof(struct Student *))

There's a lot of other stuff in here I would never do - passing all those variables in to checkArraySizeIncrease as pointers is generally a bad idea because it can mask the fact that things are getting changed, for instance.

tsc_chazz
  • 206
  • 1
  • 3
0

There is an issue in allocation of the buffer for string

char *string = (char *) calloc(strlen(dataBuffer), sizeof(char));

it should be:

char *string = (char *) calloc(1 + strlen(dataBuffer), sizeof(char));

as C-strings require extra 0-byte at the end. Without it, the following operation:

strcpy(string, dataBuffer);

may damage data after the buffer, likely messing malloc() metadata.

tstanisl
  • 13,520
  • 2
  • 25
  • 40