3

I am trying to read an unknown length line from stdin using the C language.

I have seen this when looking on the net:

char** str;
gets(&str);

But it seems to cause me some problems and I don't really understand how it is possible to do it this way.

Can you explain why this example works or doesn't work, and what the correct way to implement it would be (with malloc?)

tux3
lobengula3rd
  • The function gets() has many problems, including the possibility of overrunning the input buffer. The recommendation is to use fgets(), which allows setting a maximum length on the input string. – user3629249 Apr 12 '15 at 00:04
  • suggest using the getline() function. If that function is not already available on your system, the source code can be downloaded off the internet. – user3629249 Apr 12 '15 at 00:06

5 Answers

6

You don't want a pointer to a pointer to char. Use an array of chars

char str[128];

or a pointer to char

char *str;

If you choose a pointer, you need to reserve space using malloc

str = malloc(128);

Then you can use fgets

fgets(str, 128, stdin);

and remove the trailing newline

char *ptr = strchr(str, '\n');
if (ptr != NULL) *ptr = '\0';
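The fixed-size steps above can be put together as one small function. This is only a sketch: the name read_fixed_line is made up for illustration, and it assumes that cutting off a line longer than the buffer is acceptable (the rest of the line stays in the stream for the next read).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the answer): read one line into a
   caller-supplied buffer of `size` bytes and strip the newline.
   Returns 1 on success, 0 on end-of-file or read error. */
int read_fixed_line(char *buf, int size, FILE *f)
{
    if (fgets(buf, size, f) == NULL)
        return 0;                     /* nothing was read */
    buf[strcspn(buf, "\n")] = '\0';   /* cut at the newline, if there is one */
    return 1;
}
```

With char str[128]; a call would look like read_fixed_line(str, sizeof str, stdin). Checking the return value of fgets matters because the buffer contents cannot be relied on when it returns NULL.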

To read an arbitrarily long line, you can use getline (originally a GNU libc extension, standardized in POSIX.1-2008):

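#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>

char *foo(FILE * f)
{
    size_t n = 0;       /* getline expects a size_t *, not an int * */
    ssize_t result;
    char *buf = NULL;   /* must be NULL so that getline allocates the buffer */

    result = getline(&buf, &n, f);
    if (result < 0) {
        free(buf);      /* the buffer may have been allocated even on failure */
        return NULL;
    }
    return buf;
}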

or your own implementation using fgets and realloc:

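#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Named read_line so it does not clash with POSIX getline() from <stdio.h> */
char *read_line(FILE * f)
{
    size_t size = 0;
    size_t len  = 0;
    char *buf = NULL;
    char *tmp;

    do {
        size += BUFSIZ; /* BUFSIZ is defined as "the optimal read size for this platform" */
        tmp = realloc(buf, size); /* realloc(NULL,n) is the same as malloc(n) */
        if (tmp == NULL) {
            free(buf); /* realloc leaves the old block untouched on failure */
            return NULL;
        }
        buf = tmp;
        /* Actually do the read. Note that fgets puts a terminal '\0' on the
           end of the string, so the next chunk starts at buf + len to overwrite it */
        if (fgets(buf + len, BUFSIZ, f) == NULL) {
            if (len == 0 || ferror(f)) { /* nothing read, or a read error */
                free(buf);
                return NULL;
            }
            break; /* EOF after a partial line: return what we have */
        }
        len += strlen(buf + len);
    } while (buf[len - 1] != '\n');
    return buf;
}

Call it using

char *str = read_line(stdin);

if (str == NULL) {
    perror("read_line");
    exit(EXIT_FAILURE);
}
...
free(str);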

More info

David Ranieri
  • 1
    That doesn't handle a case where the user enters more than 128 characters in a single line. As such, it doesn't address the question of reading a line of arbitrary length. – Peter Apr 11 '15 at 11:00
  • 1
    @Peter: Sure it does. – alk Apr 11 '15 at 11:25
  • 1
    @alk, there is an edit after Peter's comment, Peter, thanks for pointing that out. – David Ranieri Apr 11 '15 at 11:26
  • @alk - the original version I commented on had a single fgets() statement with a buffer of 128 characters, not the getline() function that's there now. – Peter Apr 11 '15 at 11:58
  • 2
    @Alter Mann - realloc() can fail. Your code is now more able to handle an arbitrary length, but is assuming realloc() always succeeds. Apart from that minor quibble, much better now. – Peter Apr 11 '15 at 12:00
  • the function: strlen() gives the offset to the NUL character, so the calculation of len should be strlen(buffer) not strlen(buffer) -1 – user3629249 Apr 12 '15 at 00:09
  • @user3629249, yes, but `\n` is placed at `len - 1`, `last` is checking for the trailing newline (not for the `NUL` terminator) – David Ranieri Apr 12 '15 at 07:33
  • 2
    I believe that "fgets(buf + last, size, f);" should be "fgets(buf + last, BUFSIZ, f);" else the "realloc" may fail. – Yann-Gaël Guéhéneuc Jul 17 '15 at 01:20
  • As @Yann-GaëlGuéhéneuc says, that needs to be `fgets(buf + last, BUFSIZ, f)`, but not because `realloc` may fail. It's because the code as written may overwrite an invalid memory location. – William Pursell Mar 11 '21 at 15:29
3

Firstly, gets() provides no way of preventing a buffer overrun. That makes it so dangerous that it was removed from the C standard in C11. It should not be used. However, the usual usage is something like

char buffer[20];
gets(buffer);      /*  pray that user enters no more than 19 characters in a line */

Your usage passes gets() a pointer to a pointer to a pointer to char (the address of a char **). That is not what gets() expects, which is a plain pointer to char, so the compiler will at least warn about it and may reject the code outright.

That element of prayer reflected in the comment is why gets() is so dangerous. If the user enters 20 (or more) characters, gets() will happily write data past the end of buffer. There is no way a programmer can prevent that in code (short of accessing hardware to electrocute the user who enters too much data, which is outside the realm of standard C).

To answer your question, however, the only ways involve allocating a buffer of some size, reading data in some controlled way until that size is reached, reallocating if needed to get a greater size, and continuing until a newline (or end-of-file, or some other error condition on input) is encountered.

malloc() may be used for the initial allocation. malloc() or realloc() may be used for the reallocation (if needed). Bear in mind that a buffer allocated this way must be released (using free()) when the data is no longer needed - otherwise the result is a memory leak.
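The allocate, read, and reallocate cycle described above might look roughly like this. It is a sketch under stated assumptions, not a definitive implementation: the function name read_whole_line, the starting capacity of 16 bytes, and the doubling growth policy are all choices made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length from f.
   Returns a malloc'd string without the newline, or NULL on
   end-of-file with no data, read error, or allocation failure.
   The caller must free() the result. */
char *read_whole_line(FILE *f)
{
    size_t cap = 16;              /* arbitrary starting capacity */
    size_t len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;

    while ((c = fgetc(f)) != EOF && c != '\n') {
        if (len + 1 == cap) {     /* keep one byte free for the '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {    /* realloc failed: old block is still valid */
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }

    if ((c == EOF && len == 0) || ferror(f)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Assigning the result of realloc() to a temporary first is what keeps the original block reachable, so it can be freed if the reallocation fails.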

Peter
  • You should really use calloc to dynamically allocate an array – phil Apr 11 '15 at 11:33
  • 1
    It's not necessary to use calloc(). Just as there is no need to set a variable to zero, if the first subsequent action is to set that variable to another value. – Peter Apr 11 '15 at 11:54
1

Use the getline() function. It returns the length of the line and stores, through its first argument, a pointer to the contents of the line in an allocated memory area. (Be sure to pass the line pointer to free() when done with it.)
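As a rough illustration, a loop over getline() could look like the sketch below. It assumes a POSIX.1-2008 system; the function name longest_line is invented for the example. Note that getline() reuses and grows the same buffer across calls, so a single free() at the end is enough.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: return the length of the longest line in f,
   not counting the newline, reading with POSIX getline(). */
size_t longest_line(FILE *f)
{
    char *line = NULL;    /* getline allocates and grows this buffer */
    size_t cap = 0;       /* current capacity, maintained by getline */
    ssize_t n;
    size_t longest = 0;

    while ((n = getline(&line, &cap, f)) != -1) {
        if (n > 0 && line[n - 1] == '\n')
            n--;          /* do not count the trailing newline */
        if ((size_t)n > longest)
            longest = (size_t)n;
    }
    free(line);           /* one free() for the shared buffer */
    return longest;
}
```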

user3629249
0

"Reading an unknown length line from stdin in c with fgets"

Late response - A Windows approach:

The OP does not specify Linux or Windows, but the viable answers posted in response for this question all seem to have the getline() function in common, which is POSIX only. Functions such as getline() and popen() are very useful and powerful but sadly are not included in Windows environments.

Consequently, implementing such a task in a Windows environment requires a different approach. The link here describes a method that can read input from stdin and has been tested up to 1.8 gigabytes on the system it was developed on. (Also described in the link.) The simple code snippet below was tested using the following command line to read large quantities on stdin:

cd c:\dev && dir /s  // approximately 1.8Mbyte buffer is returned on my system 

Simple example:

#include <stdio.h>
#include <stdlib.h>
#include "cmd_rsp.h"
int main(void)
{
    char *buf = {0};
    buf = calloc(100, 1);//initialize buffer to some small value
    if(!buf)return 0;
    cmd_rsp("dir /s", &buf, 100);//recursive directory search on Windows system
    printf("%s", buf);
    free(buf);
    
    return 0;
}

cmd_rsp() is fully described in the links above, but it is essentially a Windows implementation that includes popen() and getline() like capabilities, packaged up into this very simple function.

ryyker
-3

If you want to read a string of unknown length, try the following code.

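#include <stdio.h>
#include <conio.h>   /* non-standard DOS/Windows header, needed only for clrscr() and getch() */
#include <stdlib.h>

int main(void)
{
    char *m = NULL;   /* stays NULL if scanf does not allocate anything */

    clrscr();
    printf("please input a string\n");
    /* %ms makes scanf allocate the buffer; it reads one whitespace-delimited word */
    if (scanf("%ms", &m) != 1 || m == NULL)
        fprintf(stderr, "That string was too long!\n");
    else
    {
        printf("this is the string %s\n", m);
        /* ... any other use of m */
        free(m);
    }

    getch();
    return 0;
}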

Note that %ms and %as are GNU extensions.

amdixon
Pankaj Andhale
  • Unless the string being read, including the added terminator, is no larger than the size of a `char*` (a char pointer), this code is flat-out wrong. Anything longer and this is a recipe for invoking undefined behavior. Further `free(m)` is passed an indeterminate address. Are you sure you're not thinking of `getline()` ? – WhozCraig Apr 11 '15 at 10:49
  • this invokes undefined behaviour because 'm' does not point to anything in particular. so could be reading into anywhere in memory. That is a very good way to cause a seg fault event. – user3629249 Apr 12 '15 at 00:00
  • conio.h is not portable. '%ms' is not portable – user3629249 Apr 12 '15 at 00:12