-2

I have this C assignment I am a bit struggling at this specific point. I have some background in C, but pointers and dynamic memory management still elude me very much.

The assignment asks us to write a program which would simulate the behaviour of the "uniq" command / filter in UNIX.

But the problem I am having is with the C library functions getline or getdelim (we need to use those functions according to the implementation specifications).

According to the specification, the user input might contain arbitrary amount of lines and each line might be of arbitrary length (unknown at compile-time).

The problem is, the following line for the while-loop while (cap = getdelim(stream.linesArray, size, '\n', stdin))

compiles and "works" somehow when I leave it like that. What I mean by this is that, when I execute the program, I enter arbitrary amount of lines of arbitrary length per each line and the program does not crash - but it keeps looping unless I stop the program execution (whether the lines are correctly stored in " char **linesArray; " are a different story I am not sure about.

I would like to be able to do is something like while ((cap = getdelim(stream.linesArray, size, '\n', stdin)) && (cap != -1))

so that when getdelim does not read any characters at some line (besides EOF or \n) - aka the very first time when user enters an empty line -, the program would stop taking more lines from stdin. (and then print the lines that were stored in stream.linesArray by getdelim).

The problem is, when I execute the program if I make the change I mentioned above, the program gives me "Segmentation Fault" and frankly I don't know why and how should I fix this (I have tried to do something about it so many times to no avail).

For reference:

https://pubs.opengroup.org/onlinepubs/9699919799/functions/getdelim.html

https://en.cppreference.com/w/c/experimental/dynamic/getline

http://man7.org/linux/man-pages/man3/getline.3.html

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_SIZE 20

typedef unsigned long long int ull_int;

typedef struct uniqStream
{
    char **linesArray;
    ull_int lineIndex;

}   uniq;

int main()
{
    uniq stream = { malloc(DEFAULT_SIZE * sizeof(char)), 0 };

    ull_int cap, i = 0;  
    size_t *size = 0;

    while ((cap = getdelim(stream.linesArray, size, '\n', stdin))) //&& (cap != -1))
    {
        stream.lineIndex = i;

        //if (cap == -1) { break; }
        //print("%s", stream.linesArray[i]);    

        ++i;
        if (i == sizeof(stream.linesArray))
        {
            stream.linesArray = realloc(stream.linesArray, (2 * sizeof(stream.linesArray)));
        }
    }

    ull_int j;
    for (j = 0; j < i; ++j)
    {
        printf("%s\n", stream.linesArray[j]);    
    }

    free(stream.linesArray);
    return 0;
}
  • `uniq stream = { malloc(DEFAULT_SIZE * sizeof(char)), 0 };`<< -- maybe you want `sizeof (char*)`? `if (i == sizeof(stream.linesArray))` <<-- This is the size of a pointer. – wildplasser Apr 03 '19 at 23:32
  • `size_t * size = 0` `getdelim( ... size ...`) you pass a null pointer as the second argument. Do you understand how "returning a value from a function through a pointer" works? Probably it fails with `EINVAL`. – KamilCuk Apr 03 '19 at 23:34
  • Read [*How to debug small programs*](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/) – Basile Starynkevitch Apr 03 '19 at 23:38
  • https://stackoverflow.com/a/25823469/841108 is relevant – Basile Starynkevitch Apr 03 '19 at 23:40
  • Remember that `-1` is true in C. Your loop should be along the lines of `while (getdelim(...) > 1) {}` if you want it to end on an empty line, end of input, etc. – Shawn Apr 04 '19 at 00:03

1 Answers1

0

Ok, so the intent is clear - use getdelim to store the lines inside an array. getline itself uses dynamic allocation. The manual is quite clear about it:

getline() reads an entire line from stream, storing the address of the buffer containing the text into *lineptr. The buffer is null-terminated and includes the newline character, if one was found.

The getline() "stores the address of the buffer into *lineptr". So lineptr has to be a valid pointer to a char * variable (read that twice).

*lineptr and *n will be updated to reflect the buffer address and allocated size respectively.

Also n needs to be a valid(!) pointer to a size_t variable, so the function can update it.

Also note that the lineptr buffer:

This buffer should be freed by the user program even if getline() failed.

So what do we do? We need to have an array of pointers to an array of strings. Because I don't like becoming a three star programmer, I use structs. I somewhat modified your code a bit, added some checks. You have the excuse me, I don't like typedefs, so I don't use them. Renamed the uniq to struct lines_s:

#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

struct line_s {
    char *line;
    size_t len;
};

struct lines_s {
    struct line_s *lines;
    size_t cnt;
};


int main() {
    struct lines_s lines = { NULL, 0 };

    // loop breaks on error of feof(stdin)
    while (1) {

        char *line = NULL;
        size_t size = 0;
        // we pass a pointer to a `char*` variable
        // and a pointer to `size_t` variable
        // `getdelim` will update the variables inside it
        // the initial values are NULL and 0
        ssize_t ret = getdelim(&line, &size, '\n', stdin);
        if (ret < 0) {
            // check for EOF
            if (feof(stdin)) {
                // EOF found - break
                break;
            }
            fprintf(stderr, "getdelim error %zd!\n", ret);
            abort();
        }

        // new line was read - add it to out container "lines"
        // always handle realloc separately
        void *ptr = realloc(lines.lines, sizeof(*lines.lines) * (lines.cnt + 1));
        if (ptr == NULL) {
            // note that lines.lines is still a valid pointer here
            fprintf(stderr, "Out of memory\n");
            abort();
        }
        lines.lines = ptr;
        lines.lines[lines.cnt].line = line;
        lines.lines[lines.cnt].len = size;
        lines.cnt += 1;


        // break if the line is "stop"
        if (strcmp("stop\n", lines.lines[lines.cnt - 1].line) == 0) {
            break;
        }
    }

    // iterate over lines
    for (size_t i = 0; i < lines.cnt; ++i) {
        // note that the line has a newline in it
        // so no additional is needed in this printf
        printf("line %zu is %s", i, lines.lines[i].line);
    }

    // getdelim returns dynamically allocated strings
    // we need to free them
    for (size_t i = 0; i < lines.cnt; ++i) {
         free(lines.lines[i].line);
    }
    free(lines.lines);
}

For such input:

 line1 line1
 line2 line2
 stop

will output:

 line 0 is line1 line1
 line 1 is line2 line2
 line 2 is stop 

Tested on onlinegdb.

Notes:

  1. if (i == sizeof(stream.linesArray)) sizeof does not magically store the size of an array. sizeof(stream.linesArray) is just sizeof(char**) is just a sizeof of a pointer. It's usually 4 or 8 bytes, depending if on the 32bit or 64bit architecture.

  2. uniq stream = { malloc(DEFAULT_SIZE * sizeof(char)), - stream.linesArray is a char** variable. So if you want to have an array of pointers to char, you should allocate the memory for pointers malloc(DEFAULT_SIZE * sizeof(char*)).

  3. typedef unsigned long long int ull_int; The size_t type if the type to represent array size or sizeof(variable). The ssize_t is sometimes used in posix api to return the size and an error status. Use those variables, no need to type unsigned long long.
  4. ull_int cap cap = getdelim - cap is unsigned, it will never be cap != 1.
KamilCuk
  • 120,984
  • 8
  • 59
  • 111