-2

I don't know what is wrong with the following function that reads an arbitrary long text line:

char *GetLine(FILE * f) {
    size_t size = 0;
    size_t len  = 0;
    size_t last = 0;
    char *buf = NULL;
    bool line_end = false;

    while (!feof(f) && !line_end) {
        printf("[GetLine] size = %ld, BUFSIZE = %d\n", size, BUFSIZ);
        size += BUFSIZ;
        buf = realloc(buf, size);
        assert(buf);
        if (fgets(buf + last, (int) size, f) == NULL)
            return NULL;
        len = strlen(buf);
        // overwrite '\0' at the end of the string that fgets put
        last = len - 1;
        if (last >= 0 && buf[last] == '\n')
            line_end = true;
    }

    return buf;
}

My test client is simple:

int main() {
    char *line;

    line = GetLine(stdin);

    return 0;
}

It works just fine with not too long lines (like those of length below 8,000) but it cracks for a line of length ca 16,000. My BUFSIZE is 8192

Here is what Valgrind reports:

==14413== WARNING: new redirection conflicts with existing -- ignoring it
--14413--     old: 0x04017ca0 (strlen              ) R-> (0000.0) 0x38075d61 ???
--14413--     new: 0x04017ca0 (strlen              ) R-> (2007.0) 0x04c2c730 strlen
--14413-- REDIR: 0x4017a50 (ld-linux-x86-64.so.2:index) redirected to 0x4c2c2e0 (index)
--14413-- REDIR: 0x4017c70 (ld-linux-x86-64.so.2:strcmp) redirected to 0x4c2d880 (strcmp)
--14413-- REDIR: 0x40189a0 (ld-linux-x86-64.so.2:mempcpy) redirected to 0x4c30330 (mempcpy)
--14413-- Reading syms from /lib64/libc-2.19.so
--14413-- REDIR: 0x4eba7f0 (libc.so.6:strcasecmp) redirected to 0x4a23770 (_vgnU_ifunc_wrapper)
--14413-- REDIR: 0x4ebcae0 (libc.so.6:strncasecmp) redirected to 0x4a23770 (_vgnU_ifunc_wrapper)
--14413-- REDIR: 0x4eb9f70 (libc.so.6:memcpy@GLIBC_2.2.5) redirected to 0x4a23770 (_vgnU_ifunc_wrapper)
--14413-- REDIR: 0x4eb82f0 (libc.so.6:rindex) redirected to 0x4c2bfc0 (rindex)
--14413-- REDIR: 0x4ec1180 (libc.so.6:strchrnul) redirected to 0x4c2ff40 (strchrnul)
[GetLine] size = 0, BUFSIZE = 8192
--14413-- REDIR: 0x4eb0fd0 (libc.so.6:realloc) redirected to 0x4c2b3a0 (realloc)
--14413-- REDIR: 0x4eb9640 (libc.so.6:memchr) redirected to 0x4c2d920 (memchr)
--14413-- REDIR: 0x4ebf210 (libc.so.6:__GI_memcpy) redirected to 0x4c2e220 (__GI_memcpy)
--14413-- REDIR: 0x4eb65f0 (libc.so.6:strlen) redirected to 0x4c2c670 (strlen)
[GetLine] size = 8192, BUFSIZE = 8192
==14413== Invalid write of size 1
==14413==    at 0x4C2E4C3: __GI_memcpy (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==14413==    by 0x4E9F542: _IO_getline_info (in /lib64/libc-2.19.so)
==14413==    by 0x4E9E475: fgets (in /lib64/libc-2.19.so)
==14413==    by 0x4007B8: GetLine (in /home/.../alfa_1/src/linetest)
==14413==    by 0x400832: main (in /home/.../alfa_1/src/linetest)
==14413==  Address 0x51e3080 is 0 bytes after a block of size 16,384 alloc'd
==14413==    at 0x4C2B41E: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==14413==    by 0x400777: GetLine (in /home/.../alfa_1/src/linetest)
==14413==    by 0x400832: main (in /home/.../alfa_1/src/linetest)
==14413== 

valgrind: m_mallocfree.c:304 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed.
valgrind: Heap block lo/hi size mismatch: lo = 16448, hi = 3903549025615949881.
This is probably caused by your program erroneously writing past the
end of a heap block and corrupting heap metadata.  If you fix any
invalid writes reported by Memcheck, this assertion failure will
probably go away.  Please try that before reporting this as a bug.


host stacktrace:
==14413==    at 0x3805D3B6: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==14413==    by 0x3805D4E4: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==14413==    by 0x3805D666: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==14413==    by 0x3806A433: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==14413==    by 0x38056A8B: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==14413==    by 0x3805556B: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==14413==    by 0x380593DB: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==14413==    by 0x38054B67: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==14413==    by 0x808C61F58: ???
==14413==    by 0x808B99EEF: ???

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==14413==    at 0x4E9E4E2: fgets (in /lib64/libc-2.19.so)
==14413==    by 0x4007B8: GetLine (in /home/.../alfa_1/src/linetest)
==14413==    by 0x400832: main (in /home/.../alfa_1/src/linetest)

It looks like some heap allocation problem (I suspect realloc) but I can't see the exact source of it

micsza
  • 789
  • 1
  • 9
  • 16
  • 2
    To begin with, see [Why is while(!feof(file)) always wrong?](https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong]). – John Bollinger May 26 '17 at 13:05
  • 1
    Two things: [Don't use `while (!feof(...))`](http://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong), and don't assign back to the same pointer you pass to `realloc` (think what happens if `realloc` returns `NULL`). – Some programmer dude May 26 '17 at 13:05
  • Doesn't `last >= 0` provoke a compiler warning? (Or did you forget to enable compiler warnings?) Since `last` is unsigned, the comparison must be true, which will cause a problem if `len` is 0. Also, if `fgets` returns `NULL`, you just return. That leaks memory and effectively discards the current line, so if your input does not end with a newline, the last line will be dropped into the bitbucket. That's not strictly speaking wrong -- text files should not have unterminated lines -- but it might be surprising. – rici May 26 '17 at 14:40

1 Answers1

4

The problem is here:

fgets(buf + last, (int) size, f)

If last is non-zero then the fgets call can write out of bounds of your buffer. You need to read at most size - last bytes.

Some programmer dude
  • 400,186
  • 35
  • 402
  • 621