17

The experiment I am currently working on uses a software base with a complicated source history and no well-defined license. It would be a considerable amount of work to rationalize things and release under a fixed license.

It is also intended to run on a random unixish platform, and only some of the libcs we support have GNU getline, but right now the code expects it.

Does anyone know of a re-implementation of the GNU getline semantics that is available under a less restrictive license?

Edit: I ask because Google didn't help, and I'd like to avoid writing one if possible (it might be a fun exercise, but it can't be the best use of my time).

To be more specific, the interface in question is:

ssize_t getline (char **lineptr, size_t *n, FILE *stream);
Keith Thompson
dmckee --- ex-moderator kitten

6 Answers

20

The code by Will Hartung suffers from a very serious problem: realloc may free the old block and allocate a new one, but the p pointer in that code will continue to point into the original block. This version fixes that by using array indexing instead. It also tries to replicate the standard POSIX logic more closely.

/* The original code is public domain -- Will Hartung 4/9/09 */
/* Modifications, public domain as well, by Antti Haapala, 11/10/17
   - Switched to getc on 5/23/19 */

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <stdint.h>

/* Only needed where ssize_t is not already defined (e.g. MSVC);
   drop this typedef on platforms that provide ssize_t. */
typedef intptr_t ssize_t;

ssize_t getline(char **lineptr, size_t *n, FILE *stream) {
    size_t pos;
    int c;

    if (lineptr == NULL || stream == NULL || n == NULL) {
        errno = EINVAL;
        return -1;
    }

    c = getc(stream);
    if (c == EOF) {
        return -1;
    }

    if (*lineptr == NULL) {
        *lineptr = malloc(128);
        if (*lineptr == NULL) {
            return -1;
        }
        *n = 128;
    }

    pos = 0;
    while(c != EOF) {
        if (pos + 1 >= *n) {
            size_t new_size = *n + (*n >> 2);
            if (new_size < 128) {
                new_size = 128;
            }
            char *new_ptr = realloc(*lineptr, new_size);
            if (new_ptr == NULL) {
                return -1;
            }
            *n = new_size;
            *lineptr = new_ptr;
        }

        ((unsigned char *)(*lineptr))[pos ++] = c;
        if (c == '\n') {
            break;
        }
        c = getc(stream);
    }

    (*lineptr)[pos] = '\0';
    return pos;
}

Performance can be improved on a given platform by locking the stream once and using the equivalent of getc_unlocked(3), but these functions are not standardized in C, and if you are on a POSIX system you probably have getline(3) already.
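For platforms that do have the POSIX stream-locking functions but lack getline, the idea above can be sketched roughly as follows. This is only a sketch under stated assumptions: it assumes flockfile, funlockfile and getc_unlocked are available, that ssize_t comes from <sys/types.h>, and the function name getline_locked is made up here to avoid clashing with a libc getline.

```c
#define _POSIX_C_SOURCE 200809L /* assumed: exposes flockfile/getc_unlocked */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical name; same contract as getline, but locks the stream once. */
ssize_t getline_locked(char **lineptr, size_t *n, FILE *stream)
{
    size_t pos = 0;
    int c;

    if (lineptr == NULL || n == NULL || stream == NULL) {
        errno = EINVAL;
        return -1;
    }

    flockfile(stream); /* one lock for the whole line */
    while ((c = getc_unlocked(stream)) != EOF) {
        /* Keep room for this byte and the terminating '\0'. */
        if (*lineptr == NULL || pos + 1 >= *n) {
            size_t new_size =
                (*lineptr == NULL || *n < 128) ? 128 : *n + (*n >> 2);
            char *new_ptr = realloc(*lineptr, new_size);
            if (new_ptr == NULL) {
                funlockfile(stream);
                return -1; /* *lineptr is still valid and unchanged */
            }
            *lineptr = new_ptr;
            *n = new_size;
        }
        (*lineptr)[pos++] = (char)c;
        if (c == '\n') {
            break;
        }
    }
    funlockfile(stream);

    if (pos == 0) {
        return -1; /* EOF or read error before any byte was read */
    }
    (*lineptr)[pos] = '\0';
    return (ssize_t)pos;
}
```

On Windows the same shape should be possible with the _lock_file and _getc_nolock functions mentioned in the comments below, but that variant is not shown here.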

  • I got these errors: `error: invalid conversion from ‘void*’ to ‘char*’ [-fpermissive]` for your `malloc(128)` and `realloc(*lineptr, new_size)`. I fixed it by casting them to `(char*)`: [invalid conversion from `void*' to `char*' when using malloc?](https://stackoverflow.com/questions/5099669/invalid-conversion-from-void-to-char-when-using-malloc) – Evandro Coan May 23 '19 at 15:01
  • When I tested with Cygwin C the performance was 10X worse than the builtin `getline()` – Evandro Coan May 23 '19 at 15:06
  • 1
    As for the performance, that is expected as I am using `fgetc` which needs to lock the stream for each read character. Unfortunately there is no standards-compliant way of avoiding the lock-unlock. There is for POSIX, but if you have POSIX you probably will have getline too. – Antti Haapala -- Слава Україні May 23 '19 at 16:23
  • 2
    @user on *Windows* you can use [_lock_file](https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/lock-file?view=vs-2019) and [_getc_nolock](https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/getc-nolock-getwc-nolock?view=vs-2019) – Antti Haapala -- Слава Україні May 23 '19 at 16:29
15

I'm puzzled.

I looked at the link, read the description, and this is a fine utility.

But are you saying you simply can't rewrite this function to spec? The spec seems quite clear.

Here:

/* This code is public domain -- Will Hartung 4/9/09 */
#include <stdio.h>
#include <stdlib.h>

size_t getline(char **lineptr, size_t *n, FILE *stream) {
    char *bufptr = NULL;
    char *p = bufptr;
    size_t size;
    int c;

    if (lineptr == NULL) {
        return -1;
    }
    if (stream == NULL) {
        return -1;
    }
    if (n == NULL) {
        return -1;
    }
    bufptr = *lineptr;
    size = *n;

    c = fgetc(stream);
    if (c == EOF) {
        return -1;
    }
    if (bufptr == NULL) {
        bufptr = malloc(128);
        if (bufptr == NULL) {
            return -1;
        }
        size = 128;
    }
    p = bufptr;
    while(c != EOF) {
        if ((p - bufptr) > (size - 1)) {
            size = size + 128;
            bufptr = realloc(bufptr, size);
            if (bufptr == NULL) {
                return -1;
            }
        }
        *p++ = c;
        if (c == '\n') {
            break;
        }
        c = fgetc(stream);
    }

    *p++ = '\0';
    *lineptr = bufptr;
    *n = size;

    return p - bufptr - 1;
}

int main(int argc, char** args) {
    char *buf = NULL; /*malloc(10);*/
    size_t bufSize = 0; /*10;*/   /* must be size_t: getline takes a size_t* */

    printf("%zu\n", bufSize);
    size_t charsRead = getline(&buf, &bufSize, stdin);

    printf("'%s'", buf);
    printf("%zu\n", bufSize);
    free(buf);
    return 0;
}

15 minutes, and I haven't written C in 10 years. It deviates slightly from the getline contract in that it only checks whether *lineptr is NULL, rather than whether *lineptr is NULL and *n == 0. You can fix that if you like. (The other case didn't make a whole lot of sense to me; I guess you could return -1 in that case.)

Replace the '\n' with a variable to implement "getdelim".
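A minimal sketch of that idea, written with array indexing so that a moving realloc cannot leave a stale pointer behind. The names my_getdelim and my_getline are hypothetical (chosen so they do not collide with a libc that already has the real functions), and ssize_t is assumed to come from <sys/types.h>.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h> /* ssize_t; assumed available on the target */

/* Hypothetical name: reads up to and including the byte `delim`. */
ssize_t my_getdelim(char **lineptr, size_t *n, int delim, FILE *stream)
{
    size_t pos = 0;
    int c;

    if (lineptr == NULL || n == NULL || stream == NULL) {
        errno = EINVAL;
        return -1;
    }

    while ((c = getc(stream)) != EOF) {
        /* Keep room for this byte and the terminating '\0'. */
        if (*lineptr == NULL || pos + 1 >= *n) {
            size_t new_size = (*lineptr == NULL || *n < 128) ? 128 : *n * 2;
            char *new_ptr = realloc(*lineptr, new_size);
            if (new_ptr == NULL) {
                return -1; /* *lineptr is still valid and unchanged */
            }
            *lineptr = new_ptr;
            *n = new_size;
        }
        (*lineptr)[pos++] = (char)c;
        if (c == delim) {
            break;
        }
    }

    if (pos == 0) {
        return -1; /* EOF or read error before any byte was read */
    }
    (*lineptr)[pos] = '\0';
    return (ssize_t)pos;
}

/* getline is then just getdelim with '\n' as the delimiter. */
ssize_t my_getline(char **lineptr, size_t *n, FILE *stream)
{
    return my_getdelim(lineptr, n, '\n', stream);
}
```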

Do people still write code any more?

Will Hartung
  • 12
    This works fine for short strings but may fail after reallocation. bufptr may get a new address, and p needs to be kept at the same relative offset. In my tests (with MinGW), realloc may return the same pointer several times (if there happens to be enough memory at that spot) or may return a new address on the first reallocation. The new address can be near in memory or far away, and can be before the first address as well as after. I.e., p can end up pointing at an arbitrary address. To fix, put "offset = p - bufptr;" under the while EOF line, and "p = bufptr + offset;" after the if NULL block. – Todd Apr 29 '10 at 04:52
  • `((p - bufptr) > (size - 1))` is a problem if `size == 0` (and `*lineptr` was uncharacteristically non-NULL) as `size - 1` is a _large_ number. Suggest `((p - bufptr + 1) > size)`. – chux - Reinstate Monica Dec 09 '14 at 15:51
  • In my environment malloc and realloc return void* pointers, so I had to add (char*) casts on those two lines. – jamk Dec 14 '15 at 09:32
  • 3
    You can't return -1 in `size_t`. This will fail miserably if something goes wrong. – Lilith River Mar 31 '16 at 21:31
  • 16
    ***NOTICE THAT THIS `getline` IMPLEMENTATION IS VERY BROKEN*** as pointed out by @Todd. ***DO NOT USE ANYWHERE***. – Antti Haapala -- Слава Україні Nov 10 '17 at 18:22
  • [See the one from my answer instead](https://stackoverflow.com/a/47229318/918959) – Antti Haapala -- Слава Україні Nov 10 '17 at 18:55
  • use ssize_t instead of size_t, and fix the broken "p" around the realloc. – Den-Jason Nov 02 '18 at 12:35
  • 1
    Note that if `realloc()` fails, the memory is leaked. It is not safe to use `oldptr = realloc(oldptr, newsize);` — always use `newptr = realloc(oldptr, newsize); if (newptr == NULL) …error handling…; oldptr = newptr;`. – Jonathan Leffler Apr 08 '20 at 03:15
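The realloc pattern from the last comment can be wrapped in a small helper. This is a sketch only; grow_buffer is a hypothetical name and not part of any library.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: resize *buf to new_cap bytes.
 * Returns 0 on success and -1 on failure. On failure *buf and *cap are
 * left untouched, so the caller can keep using (or free) the old block.
 */
int grow_buffer(char **buf, size_t *cap, size_t new_cap)
{
    char *new_ptr;

    if (new_cap == 0) {
        return -1; /* realloc(p, 0) is implementation-defined; refuse it */
    }
    new_ptr = realloc(*buf, new_cap);
    if (new_ptr == NULL) {
        return -1; /* old block is still allocated and still in *buf */
    }
    *buf = new_ptr;
    *cap = new_cap;
    return 0;
}
```

Because the result of realloc goes into a temporary first, a failed call never overwrites the only pointer to the old block, which is what causes the leak in the `oldptr = realloc(oldptr, newsize)` form.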
7

Use these portable versions from NetBSD: getdelim() and getline()

These come from libnbcompat in pkgsrc, and have a BSD license at the top of each file. You need both because getline() calls getdelim(). Fetch the latest versions of both files. Modify the files to fit into your program: you might need to declare getline() and getdelim() in one of your header files, and modify both files to include your header instead of the nbcompat headers.

This version of getdelim() is portable because it calls fgetc(). For contrast, a getdelim() from a libc (like BSD libc or musl libc) would probably use private features of that libc, so it would not work across platforms.

In the years since POSIX 2008 specified getline(), more Unixish platforms have added the getline() function. It is rare that getline() is missing, but it can still happen on old platforms. A few people try to bootstrap NetBSD pkgsrc on old platforms (like PowerPC Mac OS X), so they want libnbcompat to provide missing POSIX functions like getline().

George Koehler
3

If you are compiling for BSD, use fgetln instead.

Michael Andrews
1

Try using fgets() instead of getline(). I was using getline() on Linux and it worked well until I migrated to Windows, where Visual Studio did not recognize getline(). So I replaced the character pointer with a character array, and the -1 check with a NULL check. See below:

#define CHARCOUNT 1000

Before:

char *line = (char*) malloc(CHARCOUNT);
size_t size = CHARCOUNT; /* must match the size of the malloc'd buffer */
FILE *fp = fopen(file, "r");
while(getline(&line, &size, fp) != -1) {
   ...
}
free(line);

After:

char line[CHARCOUNT];
while(fgets(line, CHARCOUNT, fp) != NULL) {
   ...
}
adil
  • 1
    This will not handle lines longer than the maximum length passed to `fgets()`, lines longer than that will be split up. So it significantly changes the semantics of the program. Code using `getline()` implicitly expects to be able to read a line of any reasonable length, so replacing `getline()` with `fgets()` is a latent bug. – Andrew Henle Sep 11 '21 at 10:56
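One way to keep using fgets() without splitting long lines is to call it repeatedly into a growing buffer until a newline shows up. The following is a rough sketch of that approach, not a drop-in getline(): read_line_fgets is a hypothetical name, the caller must free() the result, and because it relies on strlen() it will mis-measure lines that contain embedded '\0' bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns one malloc'd line (including the '\n' if
 * present), or NULL at end of file or on allocation failure.
 */
char *read_line_fgets(FILE *fp)
{
    size_t cap = 128;
    size_t len = 0;
    char *buf = malloc(cap);

    if (buf == NULL) {
        return NULL;
    }

    /* Each call appends to what has been read so far. */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            return buf; /* got a complete line */
        }
        if (len + 1 >= cap) {
            /* Buffer is full and the line is not finished: grow it. */
            size_t new_cap = cap * 2;
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
    }

    if (len == 0) {
        free(buf); /* nothing was read: end of file */
        return NULL;
    }
    return buf; /* last line of the file had no trailing newline */
}
```

A loop such as `while ((line = read_line_fgets(fp)) != NULL) { ...; free(line); }` then behaves much like the getline() loop, at the cost of one allocation per line.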
0

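A corrected version of Will Hartung's code: p is re-derived after each realloc, the buffer always keeps room for the terminating '\0', and the return type is ssize_t so that -1 can be returned:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h> /* ssize_t */

ssize_t getline(char **lineptr, size_t *n, FILE *stream)
{
    char *bufptr = NULL;
    char *p = bufptr;
    size_t size;
    int c;

    if (lineptr == NULL)
    {
        return -1;
    }
    if (stream == NULL)
    {
        return -1;
    }
    if (n == NULL)
    {
        return -1;
    }
    bufptr = *lineptr;
    size = *n;

    c = fgetc(stream);
    if (c == EOF)
    {
        return -1;
    }
    if (bufptr == NULL)
    {
        bufptr = malloc(128);
        if (bufptr == NULL)
        {
            return -1;
        }
        size = 128;
    }
    p = bufptr;
    while (c != EOF)
    {
        /* Grow when there is no room for this byte plus the terminator. */
        if ((size_t)(p - bufptr) + 1 >= size)
        {
            size_t offset = (size_t)(p - bufptr);
            char *newptr = realloc(bufptr, size + 128);
            if (newptr == NULL)
            {
                /* The old block is still valid; hand it back to the caller. */
                *lineptr = bufptr;
                *n = size;
                return -1;
            }
            bufptr = newptr;
            size = size + 128;
            /* realloc may have moved the block, so re-derive p. */
            p = bufptr + offset;
        }
        *p++ = (char)c;
        if (c == '\n')
        {
            break;
        }
        c = fgetc(stream);
    }

    *p = '\0';
    *lineptr = bufptr;
    *n = size;

    return p - bufptr;
}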
Song
  • 1
    Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Dec 03 '22 at 23:20