Dynamically allocate user inputted string

Question

I am trying to write a function that does the following things:

Start an input loop, printing '> ' each iteration.
Take whatever the user enters (unknown length) and read it into a character array, dynamically allocating the size of the array if necessary. The user-entered line will end at a newline character.
Add a null byte, '\0', to the end of the character array.
Loop terminates when the user enters a blank line: '\n'

This is what I've currently written:

void input_loop(){
    char *str = NULL;

    printf("> ");

    while(printf("> ") && scanf("%a[^\n]%*c",&input) == 1){

        /*Add null byte to the end of str*/

        /*Do stuff to input, including traversing until the null byte is reached*/

        free(str);
        str = NULL;
    }
    free(str);
    str = NULL;
}

Now, I'm not too sure how to go about adding the null byte to the end of the string. I was thinking something like this:

last_index = strlen(str);
str[last_index] = '\0';

But I'm not too sure if that would work though. I can't test if it would work because I'm encountering this error when I try to compile my code:

warning: ISO C does not support the 'a' scanf flag [-Wformat=]

So what can I do to make my code work?

EDIT: changing scanf("%a[^\n]%*c",&input) == 1 to scanf("%as[^\n]%*c",&input) == 1 gives me the same error.

You can't use `strlen` to get the last index assuming your string doesn't have a null at the end of it. If it does have a null at the end, then why are you re-adding it? — ozdrgnaDiies, May 16 '15 at 03:59

hugomg · Accepted Answer · 2015-05-16T04:20:08.730

2

First of all, scanf format strings do not use regular expressions, so I don't think something close to what you want will work. As for the warning you get: according to my trusty manual, in standard C the %a conversion is for floating-point numbers, and only since C99 (your compiler is probably configured for C90). Using a as an allocation flag, as in %a[, is a glibc extension, which is why the compiler says ISO C does not support it.

But then you have a bigger problem. Standard scanf expects you to pass it a previously allocated buffer to fill with the input it reads. It does not malloc the string for you, so initializing str to NULL and the corresponding frees will not work with scanf.

The simplest thing you can do is to give up on arbitrary-length strings. Create a large buffer and forbid inputs that are longer than that.

You can then use the fgets function to populate your buffer. To check if it managed to read the full line, check if your string ends with a "\n".

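#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void){
    char str[256+1];
    for(;;){
        printf("> ");
        if(!fgets(str, sizeof str, stdin)){
            //error or end of file
            break;
        }

        size_t len = strlen(str);
        if(len == 0 || str[len-1] != '\n'){
            //no trailing newline: the line did not fit (or input ended mid-line)
            exit(1);
        }

        printf("user typed %s", str);
    }
    return 0;
}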

Another alternative is to use a function from outside the C standard library. For example, on Linux and other POSIX systems there is the getline function, which reads a full line of input using malloc behind the scenes.
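
In the same nonstandard vein, and closer to what the question attempted with %a: systems that implement POSIX.1-2008 (glibc included) accept an m assignment-allocation modifier in the scanf family, which mallocs the buffer for you. A minimal sketch, assuming such a system. read_line_m is a hypothetical helper name, and this is not portable ISO C:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line from f into a malloc'd string.
 * Returns NULL on a blank line, at end of file, or on error.
 * Assumes the POSIX.1-2008 'm' modifier is available; this is not ISO C.
 * The caller frees the result. */
char *read_line_m(FILE *f) {
    char *line = NULL;

    /* %m[^\n] allocates a buffer and stores everything up to the newline.
     * On a blank line nothing matches and line stays NULL. */
    int matched = fscanf(f, "%m[^\n]", &line);

    /* The scanset stops at the newline without consuming it, so the next
     * byte is either that newline or end of file. Consume it. */
    (void)fgetc(f);

    return matched == 1 ? line : NULL;
}
```

Because a blank line makes the conversion fail, a loop such as while ((s = read_line_m(stdin)) != NULL) ends on a blank line, which is what the question asked for. It also ends at end of file, so the two cases are not distinguished here.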

edited May 16 '15 at 04:20

answered May 16 '15 at 04:00

hugomg


I'm not really sure how to go about using `fgets`. It seems kind of confusing to me. Can you explain it to me like I'm five? EDIT: Also, how could I incorporate it into my input loop? – JavascriptLoser May 16 '15 at 04:08
You simply allocate a string beforehand with enough space according to your needs. For input, pick a large enough number like 100, 256, 512, etc depending on what you want. `fgets` takes 3 parameters: The location to put the string, the max length of the string + the null which it appends automatically, and the place to read the input from. For the input, you can specify `stdin` to read from the console or a file handle if you have it. So for instance, if you had a `char str[512];`, you'd call `fgets(str, 512 - 1, stdin);`. The -1 on the size is for the null. `fgets` returns null on failure. – ozdrgnaDiies May 16 '15 at 04:15
@ozdrgnaDiies Does `fgets` add a null byte to the end of the string? – JavascriptLoser May 16 '15 at 04:16
@PythonNewb Yes it does. You can read more about it here: http://www.cplusplus.com/reference/cstdio/fgets/ – ozdrgnaDiies May 16 '15 at 04:17
@ozdrgnaDiies: Minor point: You don't have to make room for the null byte; `fgets` will take care of that for you. `fgets(str, sizeof(str), stdin);` should be good in your example. – M Oehm May 16 '15 at 04:19
Added some examples. Please check that I didn't mess up the error condition with an off-by-one error. – hugomg May 16 '15 at 04:22
@MOehm Yes you are correct, I don't know where I got that from. – ozdrgnaDiies May 16 '15 at 04:22
@ozdrgnaDiies: Probably from inconsistent string handling in the C lib. In `scanf`, you must add the -1 explicitly, which makes it very annoying. – M Oehm May 16 '15 at 04:27
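
The points in the comments above, combined with the question's requirement to stop on a blank line, could be sketched like this. The explicit stream parameters, the 256-byte limit, and the -1 return for over-long lines are assumptions made for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Sketch: prompts on `out` and reads lines from `in` until a blank line,
 * end of file, or a read error. Returns the number of non-blank lines
 * handled, or -1 if a line did not fit in the buffer. */
int input_loop(FILE *in, FILE *out) {
    char str[256];
    int count = 0;

    for (;;) {
        fputs("> ", out);

        /* fgets stores at most sizeof str - 1 bytes and adds the null byte */
        if (fgets(str, sizeof str, in) == NULL)
            break;                          /* end of file or read error */

        size_t len = strcspn(str, "\n");    /* index of the newline, if any */
        if (str[len] != '\n' && !feof(in))
            return -1;                      /* no newline: treat as too long */
        str[len] = '\0';                    /* drop the newline */

        if (str[0] == '\0')
            break;                          /* blank line ends the loop */

        fprintf(out, "user typed %s\n", str);
        count++;
    }
    return count;
}
```

Calling input_loop(stdin, stdout) gives the interactive version. No null byte has to be added by hand, since fgets always terminates what it stores.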

score 1 · Answer 2 · answered May 16 '15 at 04:17

No error checking, don't forget to free the pointer when you're done with it. If you use this code to read enormous lines, you deserve all the pain it will bring you.

#include <stdio.h>
#include <stdlib.h>

char *readInfiniteString() {
    int l = 256;
    char *buf = malloc(l);
    int p = 0;
    char ch;

    ch = getchar();
    while(ch != '\n') {
        buf[p++] = ch;
        if (p == l) {
            l += 256;
            buf = realloc(buf, l);
        }
        ch = getchar();
    }
    buf[p] = '\0';

    return buf;
}

int main(int argc, char *argv[]) {
    printf("> ");
    char *buf = readInfiniteString();
    printf("%s\n", buf);
    free(buf);
}

answered May 16 '15 at 04:17

Will Hartung


It should be noted for people who want to use this that if `realloc` fails, `buf` will be lost and the memory will be leaked. You should instead assign the result of `realloc` to a temporary pointer for error checking and reassign it after that. – ozdrgnaDiies May 16 '15 at 04:21
@ozdrgnaDiies: That depends on what you want to do on failure. Do you return a truncated line? If so, how will the calling code know? Often, people just bail out and exit the program. No extra work needed here. – M Oehm May 16 '15 at 04:25
a possible tweak is to do `l *= SOME_CONSTANT_FACTOR` instead of `l += 256`. This way you avoid quadratic runtimes if there is a very long input line. – hugomg May 16 '15 at 04:25
`getchar` will return a value between 0 and `UCHAR_MAX` (inclusive) when it's successful (typically one of 256 values), or `EOF` (which brings the total to one of 257 values, typically) when it's indicating failure. If `ch` can't store one of 257 distinct values typically, then you run the risk of not recognising when `EOF` or an error has been flagged... not that this matters, since your loop makes no effort to check that anyway. I highly recommend changing `ch` to be an `int` (as the manual also suggests), for a start. – autistic May 16 '15 at 04:40
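
The suggestions in the comments above (an int for the character, an EOF check, a temporary pointer for realloc, and geometric growth) might be combined as in the following sketch. The name read_line, the FILE parameter, and the choice to return NULL at end of file are assumptions, not part of the original answer:

```c
#include <stdio.h>
#include <stdlib.h>

/* Sketch: reads one line from f into a malloc'd buffer.
 * Returns NULL on allocation failure, or at end of file with nothing read.
 * The caller frees the result. */
char *read_line(FILE *f) {
    size_t cap = 256;            /* current buffer size */
    size_t len = 0;              /* bytes stored so far */
    char *buf = malloc(cap);
    int ch;                      /* int, so EOF stays distinct from any byte */

    if (buf == NULL)
        return NULL;

    while ((ch = fgetc(f)) != EOF && ch != '\n') {
        if (len + 1 == cap) {    /* always keep one byte free for the null */
            char *tmp = realloc(buf, cap * 2);   /* double instead of += 256 */
            if (tmp == NULL) {
                free(buf);       /* the old block is still valid; release it */
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }

    if (ch == EOF && len == 0) { /* nothing left to read */
        free(buf);
        return NULL;
    }

    buf[len] = '\0';
    return buf;
}
```

Bailing out with NULL on allocation failure is one possible policy. As noted above, what to do on failure depends on the caller.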

score 1 · Answer 3 · answered May 16 '15 at 04:46

If you are on a POSIX system such as Linux, you should have access to getline. It can be made to behave like fgets, but if you start with a null pointer and a zero length, it will take care of memory allocation for you.

You can use it in a loop like this:

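#define _POSIX_C_SOURCE 200809L   // expose getline and ssize_t
#include <stdlib.h>
#include <stdio.h>
#include <string.h>    // for strcmp

int main(void)
{
    char *line = NULL;
    size_t nline = 0;

    for (;;) {
        ssize_t n;     // getline returns ssize_t, or -1 on failure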

        printf("> ");

        // read line, allocating as necessary
        n = getline(&line, &nline, stdin);
        if (n < 0) break;

        // remove trailing newline
        if (n && line[n - 1] == '\n') line[n - 1] = '\0';

        // do stuff
        printf("'%s'\n", line);
        if (strcmp("quit", line) == 0) break;
    }

    free(line);
    printf("\nBye\n");

    return 0;
}

The passed pointer and the length value must be consistent, so that getline can reallocate memory as required. (That means that you shouldn't change nline or the pointer line in the loop.) If the line fits, the same buffer is used in each pass through the loop, so that you have to free the line string only once, when you're done reading.

autistic · Answer 4 · 2015-05-16T05:54:51.350

Some have mentioned that scanf is probably unsuitable for this purpose. I wouldn't suggest using fgets, either. Though it is slightly more suitable, there are problems that seem difficult to avoid, at least at first. Few C programmers manage to use fgets right the first time without reading the fgets manual in full. The parts most people manage to neglect entirely are:

what happens when the line is too large, and
what happens when EOF or an error is encountered.

The fgets() function shall read bytes from stream into the array pointed to by s, until n-1 bytes are read, or a <newline> is read and transferred to s, or an end-of-file condition is encountered. The string is then terminated with a null byte.

Upon successful completion, fgets() shall return s. If the stream is at end-of-file, the end-of-file indicator for the stream shall be set and fgets() shall return a null pointer. If a read error occurs, the error indicator for the stream shall be set, fgets() shall return a null pointer...

I don't feel I need to stress the importance of checking the return value too much, so I won't mention it again. Suffice to say, if your program doesn't check the return value your program won't know when EOF or an error occurs; your program will probably be caught in an infinite loop.

fgets always parses the line once internally; any extra logic you add to check for a '\n' parses the data a second time. When no '\n' is present, the remaining bytes of the line have not been read yet.

This allows you to realloc the storage and call fgets again if you want to dynamically resize the storage, or discard the remainder of the line (warning the user of the truncation is a good idea), perhaps using something like fscanf(file, "%*[^\n]");.
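
The "discard the remainder" option could look like the sketch below. read_truncated and its truncated out-parameter are hypothetical names. It uses an fgetc loop in place of the fscanf idiom so that it can tell whether anything was actually dropped:

```c
#include <stdio.h>
#include <string.h>

/* Sketch: reads one line into buf (size bytes), dropping what does not fit.
 * Returns buf, or NULL at end of file or on a read error.
 * *truncated is set to 1 if part of the line was discarded, else 0. */
char *read_truncated(char *buf, size_t size, FILE *f, int *truncated) {
    *truncated = 0;

    if (fgets(buf, (int)size, f) == NULL)
        return NULL;

    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';         /* the whole line fit; strip the newline */
        return buf;
    }

    /* No newline was stored: skip the rest of the line, if there is any.
     * A line that exactly filled the buffer has only its newline left,
     * so nothing counts as lost in that case. */
    int c;
    while ((c = fgetc(f)) != EOF && c != '\n')
        *truncated = 1;

    return buf;
}
```

A caller can then warn the user whenever truncated comes back as 1.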

hugomg mentioned using multiplication in the dynamic resize code to avoid quadratic runtime problems. Along this line, it would be a good idea to avoid parsing the same data over and over each iteration (thus introducing further quadratic runtime problems). This can be achieved by storing the number of bytes you've read (and parsed) somewhere. For example:

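#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *get_dynamic_line(FILE *f) {
    size_t bytes_read = 0;
    char *bytes = NULL, *temp;
    do {
         /* + 2, not + 1: the first buffer needs room for one byte plus the null,
            otherwise fgets reads nothing and the loop never makes progress */
         size_t alloc_size = bytes_read * 2 + 2;
         temp = realloc(bytes, alloc_size);
         if (temp == NULL) {
             free(bytes);
             return NULL;
         }
         bytes = temp;
         bytes[bytes_read] = '\0';  /* keep a valid string even if fgets stores nothing */
         temp = fgets(bytes + bytes_read, alloc_size - bytes_read, f); /* Parsing data the first time  */
         bytes_read += strcspn(bytes + bytes_read, "\n");              /* Parsing data the second time */
    } while (temp && bytes[bytes_read] != '\n');
    bytes[bytes_read] = '\0';
    return bytes;
}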

Those who do manage to read the manual and come up with something correct (like this) may soon realise the complexity of an fgets solution is at least twice as poor as the same solution using fgetc. We can avoid parsing data the second time by using fgetc, so using fgetc might seem most appropriate. Alas most C programmers also manage to use fgetc incorrectly when neglecting the fgetc manual.

The most important detail is to realise that fgetc returns an int, not a char. It may return typically one of 256 distinct values, between 0 and UCHAR_MAX (inclusive). It may otherwise return EOF, meaning there are typically 257 distinct values that fgetc (or consequently, getchar) may return. Trying to store those values into a char or unsigned char results in loss of information, specifically the error modes. (Of course, this typical value of 257 will change if CHAR_BIT is greater than 8, and consequently UCHAR_MAX is greater than 255)

char *get_dynamic_line(FILE *f) {
    size_t bytes_read = 0;
    char *bytes = NULL;
    do {
         if ((bytes_read & (bytes_read + 1)) == 0) {
             void *temp = realloc(bytes, bytes_read * 2 + 1);
             if (temp == NULL) {
                 free(bytes);
                 return NULL;
             }
             bytes = temp;
         }

         int c = fgetc(f);
         bytes[bytes_read] = c >= 0 && c != '\n'
                             ? c
                             : '\0';
    } while (bytes[bytes_read++]);
    return bytes;
}
