6

Is it possible to read lines of text with scanf() - excluding \n and break on special(chosen) character, but include that character

This is my current expression: while(scanf("%49[^:\n]%*c", x)==1) but this one excludes :. Is it possible to break reading on : but read that character too?

Iluvatar
  • 642
  • 7
  • 19
  • I don't think so. scanf is a pretty limited tool. I think you'd almost always be better off using other functions to read the characters (e.g., fgets or getc) and then if necessary use sscanf to do any parsing or conversion. Though I usually just avoid scanf and sscanf altogether. – Waxrat Dec 24 '16 at 17:18
  • 1
    `scanf()` can't *include* the chosen (limiting) character when scanning. Why not go with `fgets()` by the way? – P.P Dec 24 '16 at 17:19
  • If you want to present solution with fgets() that's fine by me. – Iluvatar Dec 24 '16 at 17:26
  • `fgets()` will stop on newline but it can be worked around. It depends on what you want to do with the rest of the chars that `fgets()` may read. Do you want to ignore the rest or treat them as part of the *next* "line"? Perhaps, post an example. – P.P Dec 24 '16 at 17:31
  • 1
    `while(scanf("%49[^:\n]%*c", x)==1)` _does_ read the `:` via the `"%*c"`. Just read with `%c` and then append that lone character to `x`. IAC, to read a _line_ of text, use `fgets()`. – chux - Reinstate Monica Dec 24 '16 at 17:32
  • 2
    Use `fgets`, then repeatedly call `strchr` on the buffer until your 'special' char is no longer found. Otherwise do as @chux suggests, testing the `char` returned as `c` and append if `special` char or discard if `'\n'`. – David C. Rankin Dec 24 '16 at 17:33
  • If memory usage isn't a problem, [`getline()` or `getdelim()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/getdelim.html) are probably better choices than `fgets()` as `getline()`/`getdelim()` handle lines of arbitrary length. – Andrew Henle Dec 24 '16 at 17:43
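
As a minimal sketch of the `%c` suggestion from the comments above (not the OP's or any commenter's actual code): the hypothetical helper `read_field` below captures the character after the field with `%c` instead of discarding it with `%*c`, appends it only when it is `:`, and pushes it back when the 49-character limit was reached first. The function name, the `FILE *` parameter and the 51-byte buffer size are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Reads one field from `in` into buf: up to 49 characters that are neither
 * ':' nor '\n', followed by the ':' itself if that is what ended the field.
 * A terminating '\n' is consumed but not stored.
 * Returns 1 if a field (possibly empty) was read, 0 at end of input. */
int read_field(FILE *in, char buf[51]) {
    char delim = '\0';
    buf[0] = '\0';

    int n = fscanf(in, "%49[^:\n]%c", buf, &delim);
    if (n == EOF)
        return 0;                       /* nothing left to read */
    if (n == 0) {
        /* Empty field: the next character is ':' or '\n', so read it. */
        if (fscanf(in, "%c", &delim) != 1)
            return 0;
    }
    if (delim == ':') {
        size_t len = strlen(buf);       /* at most 49, so this fits */
        buf[len] = ':';
        buf[len + 1] = '\0';
    } else if (delim != '\n' && delim != '\0') {
        /* 49 characters were stored before any delimiter was seen;
         * give the extra character back for the next call. */
        ungetc(delim, in);
    }
    return 1;
}
```

A caller would loop with `while (read_field(stdin, x))`, where `x` is a `char[51]`. Fields longer than 49 characters come back in several pieces.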

4 Answers

5

Ok I am using Johannes-Schaub-litb's code.

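#include <stdint.h>  /* intptr_t */
#include <stdio.h>
#include <stdlib.h>

/* Reads from stdin up to and including the character cp, or until EOF.
 * Note: the name clashes with POSIX getline() declared in <stdio.h> on
 * many systems; rename the function if the compiler complains. */
char * getline(char cp) {
    char * line = malloc(100), * linep = line;
    size_t lenmax = 100, len = lenmax;
    int c;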


    if(line == NULL)
        return NULL;

    for(;;) {
        c = fgetc(stdin);
        if(c == EOF)
            break;

        if(--len == 0) {
            len = lenmax;
            intptr_t diff = line - linep;
            char * linen = realloc(linep, lenmax *= 2);

            if(linen == NULL) {
                free(linep);
                return NULL;
            }
            line = linen + diff;
            linep = linen;
        }

        if((*line++ = c) == cp)
            break;
    }
    *line = '\0';
    return linep;
}

I still use this code and it works fine. It will be modified a bit more later.

user2736738
  • 30,591
  • 5
  • 42
  • 56
  • 1
    There is no reason to discard what was read when `realloc` fails. Just return what was read and return NULL the next time (unless the specification is "successfully reads a line until end-of-line") – Paul Ogilvie Dec 24 '16 at 17:42
  • @PaulOgilvie.: ah to be clear...you are saying `if(linen == NULL) { return linep;}` what do you mean by return NULL the next time? – user2736738 Dec 24 '16 at 17:45
  • 1
    I mean that the next time a new attempt to allocate memory will be made and if that fails, then return NULL. Postpone failure until the last possible moment. – Paul Ogilvie Dec 24 '16 at 17:52
  • 1
    @PaulOgilvie.: Pardon me...the realloc is called when the amount of memory available is 0. So we need to allocate and then if we can't...we are returning NULL...where is the next time?? – user2736738 Dec 24 '16 at 17:57
  • 1
    So `realloc` fails and you return what you have. Then the caller wants the next line and calls `getline`, which does `malloc`, which now can fail, upon which you return null. Capiche? (Signing off for Christmas - Merry Christmas to you all.) – Paul Ogilvie Dec 24 '16 at 18:03
  • @PaulOgilvie I disagree. That introduces inconsistent behavior, and violates the specification of the function. If you return part of a line, then it's a bug. It's either the whole line or bust. – Patrick Roberts Dec 24 '16 at 18:06
  • 1
    @PaulOgilvie.: Merry Christmas...and I get what you are saying...but I guess that puts lots of responsibility on user...isn't it? – user2736738 Dec 24 '16 at 18:07
  • Note that `fgets()` return `NULL` on 1) end-of-file and no data read or 2) input error detected. That functionality differs from this `getline()`. – chux - Reinstate Monica Dec 24 '16 at 18:23
  • "Code is really well written" --> Disagree. `line = linen + (line - linep);` references `line` and `linep`, both are pointers to free'd memory --> UB. – chux - Reinstate Monica Dec 24 '16 at 18:50
  • @chux.:Thanks for noticing...would you suggest edits? or I may edit it when I get time... – user2736738 Dec 24 '16 at 18:52
  • Edit at your leisure - MC. – chux - Reinstate Monica Dec 24 '16 at 18:54
  • @chux.:and if possible would you clarify a bit? in which case they are null? – user2736738 Dec 24 '16 at 18:56
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/131422/discussion-between-chux-and-coderredoc). – chux - Reinstate Monica Dec 24 '16 at 19:11
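
One way to sidestep the pointer concern raised in the comments above is to track an index instead of a second pointer, so nothing refers to the old block after `realloc()`. The following is only a sketch under assumptions, not the answer's code: the name `read_until`, the `FILE *` parameter, and the choice to return `NULL` when end-of-file is hit before any character is read (mirroring `fgets()`) are all illustrative.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads from `in` up to and including `delim`, or
 * until EOF. Returns a malloc'd string that the caller must free(), or
 * NULL on allocation failure or when nothing could be read. */
char *read_until(FILE *in, char delim) {
    size_t cap = 100;   /* current buffer capacity */
    size_t len = 0;     /* number of characters stored so far */
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;

    while ((c = fgetc(in)) != EOF) {
        if (len + 1 >= cap) {           /* keep one byte for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
        if (c == (unsigned char)delim)
            break;
    }

    if (len == 0) {                     /* EOF before any character */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Whether a failed `realloc()` should discard the partial data, as here, or return it is the design question debated in the comments; this sketch takes the "whole result or nothing" side.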
0

I have done this in a little different way. Maybe this can crash on Windows.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    char *input_str;
    
    /**
     * Dynamic memory allocation.
     * Might crash on windows.
     */
    int status = scanf("%m[^.]", &input_str);
    
    /**
     * If the first character is the 
     * terminating character then scanf scans nothing
     * and returns 0.
     */
    if (status > 0) {
        /**
         * Calculate the length of the string.
         */
        size_t len = strlen(input_str);
    
        /**
         * When allocating memory, provide
         * two extra cells: one for the character
         * you want to include and
         * one for the null character.
         */
        char *new_str = (char*) calloc (len + 2, sizeof(char));
    
        /**
         * Check for memory allocation.
         */
        if(new_str == NULL) {
            printf("Memory Allocation failed\n");
            exit(1);
        }

        /**
         * Copy the string.
         */
        strcpy(new_str, input_str);
    
        /**
         * get your desired terminating character 
         * from buffer
         */
        new_str[len++] = getc(stdin);
    
        /**
         * Append the NULL character
         */
        new_str[len++] = '\0';
    
        /**
         * eat other characters
         * from buffer.
         */
        while(getc(stdin) != '\n');
    
        /**
         * Free the memory used in
         * dynamic memory allocation
         * in scanf. Which is a must
         * according to the scanf man page.
         */
        free(input_str);
    } else {
        char new_str[2] = ".\0";
        /**
         * eat other characters
         * from buffer.
         */
        while(getc(stdin) != '\n');
    }
}
    

I have used dot as a terminating character.

VLAZ
  • 26,331
  • 9
  • 49
  • 67
Parnab Sanyal
  • 749
  • 5
  • 19
  • There is no security issue with `strcpy(new_str, input_str);`. There are other problems like 1) `len` should have been `size_t`, not `int`, 2) the result of `calloc()` should be checked against `NULL`, 3) no check for `EOF` on `new_str[len++] = getc(stdin);` – chux - Reinstate Monica Dec 24 '16 at 18:57
  • I have read that strcpy can cause a buffer overflow attack. So, use of strncpy can prevent it. – Parnab Sanyal Dec 24 '16 at 18:58
  • I have made the changes you have suggested @chux – Parnab Sanyal Dec 24 '16 at 19:02
  • `strncpy()` has its own set of problems. [Example](http://stackoverflow.com/q/21210293/2410359). The point is this code does not need `strncpy()` here, yet touts using it for greater security, while missing other relevant concerns. – chux - Reinstate Monica Dec 24 '16 at 19:05
  • 1
    What does `scanf("%m[^.]", &input_str);` do if the _first_ character is `'.'`? I suspect nothing is saved into `input_str`, no memory allocated. Code should have checked the return value of `scanf()` before using `input_str`. – chux - Reinstate Monica Dec 24 '16 at 19:10
  • thanks for pointing out that. I was getting `segmentation fault`. In `gdb` it turned out that, it was caused in `strlen`. @chux – Parnab Sanyal Dec 24 '16 at 19:28
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/131425/discussion-between-chux-and-parnab-sanyal). – chux - Reinstate Monica Dec 24 '16 at 19:40
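
The missing `EOF` check mentioned in the comments also matters for loops of the form `while(getc(stdin) != '\n');`, which never terminate once the input ends without a newline. A small sketch of a safer way to discard the rest of a line follows; the helper name `discard_line` and its `FILE *` parameter are hypothetical.

```c
#include <stdio.h>

/* Discards characters from `in` up to and including the next '\n'.
 * Returns the last value read: '\n', or EOF if the input ended first. */
int discard_line(FILE *in) {
    int c;
    while ((c = getc(in)) != EOF && c != '\n') {
        /* nothing to do: the character is simply dropped */
    }
    return c;
}
```

Storing the result of `getc()` in an `int` rather than a `char` is what allows `EOF` to be told apart from ordinary characters.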
0

Is it possible to read lines of text with scanf() - excluding \n and break on special(chosen) character, but include that character(?)

Yes. But scanf() is notorious for being used wrong and difficult to use right. Certainly the scanf() approach will work for most user input. Only a rare implementation will completely meet OP's goal without missing corner cases. IMO, it is not worth it.

Alternatively, let us try the direct approach: repeatedly calling fgetc(). Some untested code:

#include <assert.h>
#include <stdio.h>

char *luvatar_readline(char *destination, size_t size, char special) {
  assert(size > 0);
  char *p = destination;

  int ch;
  while (((ch = fgetc(stdin)) != EOF) && (ch != '\n')) {
    if (size > 1) {
      size--;  
      *p++ = ch; 
    } else {
      // Ignore extra input or 
      // TBD what should be done if no room left
    }
    if (ch == (unsigned char) special) {
      break;
    }
  }
  *p = '\0';

  if (ch == EOF) {
    // If rare input error
    if (ferror(stdin)) {
      return NULL;
    }
    // If nothing read and end-of-file  
    if ((p == destination) && feof(stdin)) {
      return NULL;
    }
  }
  return destination;
}      

Sample usage

char buffer[50];
while (luvatar_readline(buffer, sizeof buffer, ':')) { 
  puts(buffer);
}

Corner cases TBD: Unclear what OP wants if special is '\n' or '\0'.


OP's while(scanf("%49[^:\n]%*c", x)==1) has many problems.

  1. Does not cope with input that begins with : or '\n', leaving x unset.

  2. Does not know if the character after the non-:, non-'\n' input was a :, '\n', EOF.

  3. Does not consume extra input past 49.

  4. Uses a fixed special character ':', rather than a general one.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • 1
    Thank you for the proposed solution. I'm using `fgets` and `strchr` to read and cut the whole line of text and right now I can handle an empty string or a single char (even the chosen delimiter). I haven't tested it yet, but if your code does better I will use it; btw. my nickname is Iluvatar (starts with `i`). – Iluvatar Dec 25 '16 at 02:48
  • @Iluvatar `fgets/strchr` is the better solution. Go with that, maybe even post your own answer. – chux - Reinstate Monica Dec 25 '16 at 07:06
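
The `fgets`/`strchr` route recommended in the comments above could look roughly like the sketch below. It is not the OP's code, which was never posted; the names `next_piece` and `split_lines`, the `FILE *` parameters and the buffer sizes are assumptions for illustration. Each line is read with `fgets()`, its trailing newline is dropped, and `strchr()` locates each special character so that it stays attached to the piece it ends.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the next piece of `s` into `out`, up to and
 * including the first `special` character, or to the end of the string if
 * `special` does not occur. Pieces longer than outsize - 1 are split.
 * Returns a pointer to the unread remainder, or NULL when `s` is empty. */
const char *next_piece(const char *s, char special, char *out, size_t outsize) {
    if (*s == '\0' || outsize < 2)
        return NULL;
    const char *hit = (special != '\0') ? strchr(s, special) : NULL;
    size_t n = hit ? (size_t)(hit - s) + 1 : strlen(s);
    if (n > outsize - 1)
        n = outsize - 1;
    memcpy(out, s, n);
    out[n] = '\0';
    return s + n;
}

/* Reads lines with fgets(), drops the trailing '\n', and writes each piece
 * (special character included) on its own line. */
void split_lines(FILE *in, FILE *out, char special) {
    char line[256];
    char piece[50];
    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\n")] = '\0';
        const char *p = line;
        while ((p = next_piece(p, special, piece, sizeof piece)) != NULL)
            fprintf(out, "%s\n", piece);
    }
}
```

A call such as `split_lines(stdin, stdout, ':')` processes standard input. Input lines longer than 255 characters are split by `fgets()` itself in this sketch; `getline()` would avoid that limit on POSIX systems.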
0

I think this is what you want to do:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
  char *line = NULL;
  size_t len = 0;
  ssize_t read;
  while ((read = getline(&line, &len, stdin)) != -1) {
    if (read > 0 && line[read - 1] == '\n') {
      if (read > 1 && line[read - 2] == '\r') {
        line[read - 2] = '\0'; // we can remove the carriage return
      }
      else {
        line[read - 1] = '\0'; // we can remove the new line
      }
    }

    char const *delim = ":";
    printf("parsing line :\n");
    char *token = strtok(line, delim);

    while (token != NULL) {
      printf("token: %s\n", token);
      token = strtok(NULL, delim);
    }
  }
  free(line);
}
Stargateur
  • 24,473
  • 8
  • 65
  • 91