
Hello, I need a better explanation of the fscanf format parameter. I have the following text file:

this is line 1
this is line 2
this is line 3
...

I read it with:

for(int i=0;i<2;++i){
   char test[255];
   fscanf(fp,"%[^\n]",test);
   printf("%s\n",test);
}

Now I get:
this is line 1
this is line 1
With "%[^\n]\n" I get:
this is line 1
this is line 2

Now I break my statement apart as far as I understand it: % means read it unformatted (%s would give me a string, %c a single character, ...), and [^\n] means read until you get something that does not match, in this case a newline.

Could you explain the function of the square brackets and the termination better? I read the official explanations but don't fully understand them.

Extension 1 of my question: I am aware that there are easier options to achieve my goal, but I am just trying to understand the syntax of fscanf.

Extension 2: if I understand it correctly,

fscanf(fp,"%[^\n]%*c",test)

reads until the newline and skips the next character, which IS the newline. Following this logic, %[^\n] would match every character except newline. So I could write

for(int i=0;i<2;++i){
   char test[255];
   fscanf(fp,"%[a-z]",test);
   printf("%s\n",test);
}

and I would expect to get

this
is

But I get

ٷ�
ٷ�

Extension 3: this question is not a duplicate of "scanf() leaves the new line char in the buffer", as I want to read a complete line.

1 Answer

Using the scanf family of functions is an exercise in accounting for what remains in your input stream. The format specifier %[^\n] uses a character class (here, anything that is not a '\n') to read up to the newline -- allowing string input that contains spaces. A character class ([...], called a scanset in the C standard) matches any of the characters listed within the brackets; a '^' as the first character inverts the match, so the class matches every character that is not listed. When the conversion is complete, the '\n' line-end is left in the input buffer.

Since the string conversion ends when it encounters a newline, without removing the '\n' from the input buffer, your next attempt to read stops immediately without any characters being read (a matching failure, so fscanf returns 0). The value of test is left unchanged (still holding the first line), so when you print it again, you get the same thing.

From man 3 scanf

s  Matches a sequence of non-white-space characters; ...  The input string stops
   at white space or at the maximum field width, whichever occurs first.

[  Matches a nonempty sequence of characters from the specified set of accepted 
    characters; ... The usual skip of leading white space is suppressed.

When you change the format string to include a newline ("%[^\n]\n"), the trailing '\n' (like any white space in a format string) consumes the newline that remains, along with any further white space that follows it. So your next attempt to read sees the first character of the second line and it reads correctly. You could also use "%[^\n]%*c", where '*' is the assignment-suppression character, to read and discard the single character following the character class.

This pitfall (failing to account for characters that remain in the input buffer) is one of the primary reasons new C programmers are encouraged to use fgets (or POSIX getline) for line-oriented input as both functions read up to (and include) the newline in the buffers they fill -- eliminating the potential for leaving it unread -- just waiting to cause problems for your next read.

David C. Rankin
  • Thank you for the good explanation. But one thing I don't understand: when I use "%[a-z]" I get garbage. What is the difference from %[^\n]? Sure, the latter accepts everything except newline. But shouldn't I get "this" from "this is line 1", as every character from a-z is allowed? – Felix Yah Batta Man Mar 08 '18 at 04:46
  • `[a-z]` will match lower-case characters only. No spaces, etc. It will not read the trailing newline either. You can use `"%254[a-z]"` to properly limit your input to `254` chars to avoid filling beyond your array bounds. You should not be getting garbage, but unless you are accounting for any non-lower-case characters and the `'\n'`, you will experience problems. Suggest `fgets (test, sizeof test, fp);` instead of `fscanf`. Then all you need to do is `size_t len = strlen (test); if (len && test[len-1] == '\n') test[--len] = 0;` to remove the trailing newline from `test`. – David C. Rankin Mar 08 '18 at 04:57
  • If you want to edit your question and add the code you are having problems with at the end of your current question, I'm happy to help. It's hard for me to troubleshoot code I'm having to guess at `:)` (just drop another comment letting me know you added code and I'll come take a look) – David C. Rankin Mar 08 '18 at 05:00
  • added another comment – Felix Yah Batta Man Mar 08 '18 at 06:17