0

I need to read a finite yet unbounded-in-length string. We learned only about scanf so I guess I cannot use fgets. Anyway, I ran this code on an input with length larger than 5.

char arr[5];
scanf("%s", arr);

char *s = arr;
while (*s != '\0')
    printf("%c", *s++);

scanf keeps scanning and writing the overflowed part, but it seems like a hack. Is that good practice? If not, how should I read it?

Note: We have learned about the alloc functions family.

Elimination

6 Answers

1

scanf is the wrong tool for this job (as for most jobs). If you are required to use this function, read one char at a time with scanf("%c", &c).

Your code misuses scanf(): you are passing arr, the address of an array of pointers to char instead of an array of char.

You should allocate an array of char with malloc, read characters into it and use realloc to extend it when it is too small, until you get a '\n' or EOF.

If you can rewind stdin, you can first compute the number of chars to read with scanf("%*s%n", &n);, then allocate the destination array to n+1 bytes, rewind(stdin); and re-read the string into the buffer with scanf("%s", buf);. It is risky business, as some streams such as console input cannot be rewound.

For example:

fpos_t pos;
int n = 0;
char *buf;

fgetpos(stdin, &pos);
scanf("%*[^\n]%n", &n);
fsetpos(stdin, &pos);
buf = calloc(n+1, 1);
scanf("%[^\n]", buf);

Since you are supposed to know just some basic C, I doubt this solution is what is expected from you, but I cannot think of any other way to read an unbounded string in one step using standard C.

If you are using glibc and may use extensions, you can do this:

scanf("%a[^\n]", &buf);

PS: all error checking and handling is purposely omitted here, but should be handled in your actual assignment.

chqrlie
  • How do I read "read characters into it"? (I need to **read all at once** from the console) – Elimination Apr 16 '15 at 21:10
  • `scanf("%c", buf + offset)` reads one `char` at position `offset` in `buf` declared as `char *buf` and allocated with `buf = malloc(size)`. `scanf` returns `1` if the char was read or `EOF` at end of file. – chqrlie Apr 16 '15 at 21:13
  • Since you are supposed to read characters with `scanf`, use the format `"%c"` to read one `char` at a time. Technically, you could use `"%99[^\n]"` to read at most 99 chars into a buffer and stop on `'\n'`, but it is probably more advanced than you are supposed to be. – chqrlie Apr 16 '15 at 21:16
  • Some C libraries have extensions to `scanf()` to allocate the destination array, but I doubt you are supposed to use that. Maybe you can use the non-standard function `getline()`. – chqrlie Apr 16 '15 at 21:20
  • Cannot use external libraries as it's an exercise of a basic C programming course. – Elimination Apr 16 '15 at 21:21
  • `scanf("%*[^\n]%n", ...);` fails if first character is `'\n'`. `n` is never set. Code then exhibits UB. – chux - Reinstate Monica Apr 16 '15 at 23:27
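The bounded `"%99[^\n]"` format mentioned in the comments above can be combined with `realloc` to read a line of any length in chunks. What follows is only a minimal sketch, not part of any answer: the name `read_line_chunked` is hypothetical, and it takes a `FILE *` and uses `fscanf` (pass `stdin` for console input) purely so that it can be exercised on any stream. It also covers the empty-line case raised by chux, where the conversion stores nothing and returns 0.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` in chunks of at most 99
   characters. Returns a heap-allocated string without the newline, or NULL
   on allocation failure or when end of input is hit before any character.
   The caller must free() the result. */
char *read_line_chunked(FILE *in) {
    char chunk[100];   /* 99 characters plus the nul terminator */
    char *line = NULL;
    size_t len = 0;
    int rc;

    /* Returns 1 while a non-empty chunk was stored, 0 when the next
       character is '\n', and EOF at end of input. */
    while ((rc = fscanf(in, "%99[^\n]", chunk)) == 1) {
        size_t n = strlen(chunk);
        char *tmp = realloc(line, len + n + 1);
        if (tmp == NULL) {
            free(line);
            return NULL;
        }
        line = tmp;
        memcpy(line + len, chunk, n + 1);   /* copies the terminator too */
        len += n;
    }

    if (rc == 0) {
        /* The next character is the newline that ended the line: skip it. */
        if (fscanf(in, "%*c") == EOF) {
            free(line);
            return NULL;
        }
    } else if (line == NULL) {
        return NULL;   /* end of input before anything was read */
    }

    if (line == NULL) {
        /* Empty line: return an empty string rather than NULL. */
        line = malloc(1);
        if (line != NULL)
            line[0] = '\0';
    }
    return line;
}
```

Calling `read_line_chunked(stdin)` repeatedly would return successive lines until it yields NULL. Growing the buffer by the exact chunk length keeps the sketch short; a real program might prefer doubling the capacity.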
1

%as or %ms (POSIX) can be used for this purpose if you are using gcc with glibc (this is not standard C).

#include <stdio.h>
#include <stdlib.h>

int main(void){
    char *s;
    scanf("%as", &s);
    printf("%s\n", s);
    free(s);
    return 0;
}
BLUEPIXY
  • @chqrlie The OP uses `"%s"`, but `"%m[^\n]"` can be used as well. – BLUEPIXY Apr 16 '15 at 23:36
  • OP mistakenly uses `"%s"`. He also mentioned `fgets()`. I asked him to be more concise and he said reading stops at *end of line*. Read the comments on the main question. – chqrlie Apr 16 '15 at 23:39
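For reading a whole line rather than a single word, the `"%m[^\n]"` format from the comment above could be used roughly as follows. This is a sketch under assumptions: it needs a C library that implements the POSIX.1-2008 `m` modifier (glibc does), it is not ISO C, and the helper name `read_line_m` is made up for illustration. It uses `fscanf` on a `FILE *` so that any stream can be passed, `stdin` included.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper relying on the POSIX `m` modifier (not ISO C).
   Returns a heap-allocated line without its newline, or NULL at end of
   input or on allocation failure. The caller must free() the result. */
char *read_line_m(FILE *in) {
    char *line = NULL;
    int rc = fscanf(in, "%m[^\n]", &line);

    if (rc == EOF)
        return NULL;            /* nothing left to read */
    if (rc == 0)
        line = calloc(1, 1);    /* empty line: nothing was allocated for us */

    /* Skip the newline that ended the line; at end of input there is none,
       so the result is deliberately ignored. */
    int skipped = fscanf(in, "%*c");
    (void)skipped;

    return line;
}
```

The library sizes the buffer itself, so no manual `realloc` loop is needed, at the price of portability.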
1

Buffer overflows are a plague, among the most famous and yet most elusive bugs. So you should definitely not rely on them.

Since you've learned about malloc() and friends, I suppose you're expected to make use of them.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

// Array growing step size
#define CHUNK_SIZE  8

int main(void) {
    size_t arrSize = CHUNK_SIZE;
    char *arr = malloc(arrSize);
    if(!arr) {
        fprintf(stderr, "Initial allocation failed.\n");
        goto failure;
    }

    // One past the end of the array
    // (next insertion position)
    size_t arrEnd = 0u;

    for(char c = '\0'; c != '\n';) {
        if(scanf("%c", &c) != 1) {
            fprintf(stderr, "Reading character %zu failed.\n", arrEnd);
            goto failure;
        }

        // No more room, grow the array
        // (-1) takes into account the
        // nul terminator.
        if(arrEnd == arrSize - 1) {
            arrSize += CHUNK_SIZE;
            char *newArr = realloc(arr, arrSize);
            if(!newArr) {
                fprintf(stderr, "Reallocation failed.\n");
                goto failure;
            }
            arr = newArr;

            // Debug output
            arr[arrEnd] = '\0';
            printf("> %s\n", arr);
            // Debug output
        }

        // Append the character and
        // advance the end index
        arr[arrEnd++] = c;
    }
    // Nul-terminate the array
    arr[arrEnd++] = '\0';

    // Done !
    printf("%s", arr);

    free(arr);
    return 0;

failure:
    free(arr);
    return 1;
}
Quentin
0

Try limiting the number of characters accepted:

scanf("%4s", arr);
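To illustrate what the width limit does: with `"%4s"`, at most 4 characters are stored, which leaves room for the terminator in a 5-byte array, and whatever is left of the word stays in the stream for the next read. The sketch below is illustrative only; `read_word4` is a hypothetical name, and it uses `fscanf` on a `FILE *` so it can be tried on any stream.

```c
#include <stdio.h>

/* Hypothetical helper: reads the next whitespace-delimited word from `in`,
   storing at most 4 characters plus the nul terminator in `out`.
   Returns 1 if something was stored, 0 otherwise. */
int read_word4(FILE *in, char out[5]) {
    return fscanf(in, "%4s", out) == 1;
}
```

On the input `abcdefghij`, three successive calls would store `abcd`, `efgh` and `ij`. The limit therefore prevents the overflow, but it splits long input into pieces instead of reading it as one string.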
Deanie
0

The problem is simply that you are writing past the end of arr. "Hopefully" you keep writing to memory that belongs to the process, but if you go beyond it you will end up with a segmentation fault.

Amessihel
0

Consider

1) On many systems, malloc() only reserves memory without using it. Physical memory is not consumed until the memory is actually written to. See Why is malloc not "using up" the memory on my computer?

2) Unbounded user input is not realistic. Since some upper bound should be enforced anyway to guard against hackers and nefarious users, simply use a large buffer.

If your system can work with these two ideas:

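#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// The function name is illustrative; the original fragment did not name it.
char *read_word(void) {
    char *buf = malloc(1000000);
    if (buf == NULL) return NULL; // Out of memory
    if (scanf("%999999s", buf) != 1) { free(buf); return NULL; } // EOF

    // Now right-size the buffer
    size_t size = strlen(buf) + 1;
    char *tmp = realloc(buf, size);
    if (tmp == NULL) { free(buf); return NULL; } // Out of memory
    return tmp;
}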

Fixed up per @chqrlie comments.

Community
chux - Reinstate Monica
  • 1
    `size_t len = strlen(buf) + 1;` is misleading, better name this local variable `size`. `memcpy(tmp, buf, len);` is useless. `free(buf);` is a bug: it will either free `tmp` before returning it or double free `buf`. Also why return `NULL` if `realloc()` fails? should return `buf` instead. – chqrlie Apr 16 '15 at 23:44
  • @chqrlie Completely agree on 3 of 4 points. If `realloc()` fails, the function failed. Returning `buf` would not allow the calling code to distinguish between success and failure. In any case, there are a number of error-handling issues left unspecified by the OP, so this is a rough idea. – chux - Reinstate Monica Apr 17 '15 at 01:09