Why does this fscanf() read garbage values

Question

The file I'm scanning looks like this:

Casablanca
1942 D 6.5 4.5 6.0 8.0 7.5
Capernaum
2018 D 5.5 4.5 8.0 8.0 6.5
Stranger Than Paradise
1984 D 6.5 5.5 6.0 8.0 4.5
Three Colors: Red
1994 D 6.5 8.5 6.0 8.0 8.5
Good Bye Lenin!
2003 C 7.5 3.5 6.0 8.0 9.5
Perfume: The Story of a Murderer
2006 D 6.5 5.5 6.0 8.0 5.5
The Shawshank Redemption
1994 D 7.5 5.5 6.0 8.0 8.5
Hacksaw Ridge
2016 D 7.5 7.5 6.0 8.0 7.5
Lost in Translation
2003 D 6.5 4.5 6.0 8.0 7.5
Forrest Gump
1994 D 6.5 9.5 6.0 8.0 6.5
Sleepless in Seattle
1993 R 5.5 4.5 6.0 8.5 7.5
Pride and Prejudice
2005 R 7.5 4.5 7.0 8.0 8.5

I tried scanning it like this. Basically, I want to read all the values from each pair of lines and store them in the corresponding fields of a struct:

while(fscanf(f, "%[^\n]s %d %c %f %f %f %f %f", &filmy[licznik].nazwa, &filmy[licznik].rok, &filmy[licznik].rodzaj, &filmy[licznik].oceny[0], &filmy[licznik].oceny[1], &filmy[licznik].oceny[2], &filmy[licznik].oceny[3], &filmy[licznik].oceny[4]) != EOF)

and when I print it later using printf():

int i;
for(i = 0; i < N; i++)
{
printf("%s\n%d %c %.1f %.1f %.1f %.1f %.1f\n", filmy[i].nazwa, filmy[i].rok, filmy[i].rodzaj, filmy[i].oceny[0], filmy[i].oceny[1], filmy[i].oceny[2], filmy[i].oceny[3], filmy[i].oceny[4]);
}

instead of output that looks exactly like the input file, I get this:

Casablanca
11801600 0.0 0.0 0.0 0.0 0.0

70 0.0 0.0 0.0 0.0 0.0
F
5 0.0 0.0 0.0 0.0 0.0
P☺┤
0 0.0 0.0 0.0 0.0 0.0

80 0.0 0.0 0.0 0.0 -0.0
Ü2┤
6619204 s 0.0 0.0 0.0 0.0 -0.0
k
7274608 g 0.0 0.0 0.0 0.0 0.0

5 11791315968.0 0.0 0.0 0.0 0.0

0 0.0 0.0 0.0 0.0 0.0

11801176 1834304256.0 0.0 0.0 0.0 0.0
@§@
1322953350 0.0 0.0 0.0 0.0 0.0
►
8 0.0 0.0 0.0 0.0 0.0

I'm pretty sure that '%[^\n]s' causes the problem, but I have no idea how to scan a title that includes more than one word without it.

`%[^\n]s` is incorrect. That tries to match a literal `s`. You want `%[^\n]` — William Pursell, Dec 09 '21 at 18:02
Changed the format string passed to `fscanf()` to `"%[^\n] %d %c %f %f %f %f %f\n"`, but there's still an error while reading the last row. Instead of getting `Pride and Prejudice 2005 R 7.5 4.5 7.0 8.0 8.5`, I'm getting `► 8 0.0 0.0 0.0 0.0 0.0` — Mateusz, Dec 09 '21 at 18:10
If you want to parse something that is of any complexity at all, write a real parser rather than relying on the feature-poor `scanf`. But in this case, it's probably fine to use a combination of `fgets` and `sscanf`. But you should always check the return value of the `scanf` functions to verify the number of inputs it matched. — Cheatah, Dec 09 '21 at 18:15
Try adding a space and use `" %[^\n]`. You're probably getting a short read. — William Pursell, Dec 09 '21 at 18:16
Without that space, `%[^\n]` in `"%[^\n] %d %c %f %f %f %f %f"` will stop at the *previous* newline. Read up on how whitespace is handled differently by different format specifiers. Also the trailing newline in your [comment](https://stackoverflow.com/questions/70294604/why-does-this-fscanf-read-garbage-values#comment124261788_70294604) is [wrong](https://stackoverflow.com/questions/19499060/what-is-the-effect-of-trailing-white-space-in-a-scanf-format-string). — Weather Vane, Dec 09 '21 at 18:21
Also, right here is your first encounter with globalization: Write your programs in English. — Peter - Reinstate Monica, Dec 09 '21 at 18:23

William Pursell · Answer 1 · 2021-12-09T18:43:54.217

`%[^\n]s` is incorrect. The trailing `s` is not part of the conversion; it tries to match a literal `s` in the input. You want `%[^\n]`.

Also, you should check that fscanf performed as many conversions as you expect, e.g.:

while( fscanf(f, " %[^\n] %d %c %f %f %f %f %f", ...) == 8 )

Note the added space before the first conversion specifier. The `[` conversion does not skip leading whitespace, so if the next character in the input stream is a newline, `%[^\n]` fails to match and fscanf returns immediately without reading any data. Since each call stops reading right after the last float on a line, the next character in the stream is usually that line's newline.
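To see the effect of the leading space in isolation, here is a minimal sketch. It uses sscanf on a string as a stand-in for fscanf on a file (the format directives behave the same way), and the two helper names are made up for illustration.

```c
#include <stdio.h>

/* Both helpers try to read one title from `input` into `out`
   (which must hold at least 16 bytes) and return sscanf's
   conversion count. The names are hypothetical. */

/* No leading space: %[ does not skip whitespace, so a pending '\n'
   causes an immediate matching failure. */
int scan_title_no_space(const char *input, char *out)
{
    return sscanf(input, "%15[^\n]", out);
}

/* Leading space: the whitespace directive first consumes any
   spaces, tabs, and newlines, then %[ reads up to the next '\n'. */
int scan_title_with_space(const char *input, char *out)
{
    return sscanf(input, " %15[^\n]", out);
}
```

With the input `"\nCasablanca\n"`, the first helper returns 0 and the second returns 1 with `Casablanca` stored in `out`.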

Also, you should protect against buffer overflow by limiting how much data `%[^\n]` will read. It should read at most one less than the size of the buffer being written to, which leaves room for the terminating null character. So if nazwa is of size 512, you should write:

while( fscanf(f, " %511[^\n] %d %c %f %f %f %f %f", ...) == 8 )
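Putting the pieces together, a complete read loop might look like the sketch below. The struct layout and the 64-byte title buffer are assumptions, since the question does not show the real declaration; only the field names are taken from the question.

```c
#include <stdio.h>

/* Hypothetical record layout; the question does not show the real struct. */
struct film {
    char  nazwa[64];   /* title */
    int   rok;         /* year */
    char  rodzaj;      /* genre letter */
    float oceny[5];    /* five ratings */
};

/* Reads up to `max` records from `f`; returns how many were read in full. */
size_t read_films(FILE *f, struct film *filmy, size_t max)
{
    size_t licznik = 0;

    /* 63 == sizeof nazwa - 1. The loop stops at the first record that
       does not yield all 8 conversions (end of file or malformed input). */
    while (licznik < max &&
           fscanf(f, " %63[^\n] %d %c %f %f %f %f %f",
                  filmy[licznik].nazwa,
                  &filmy[licznik].rok,
                  &filmy[licznik].rodzaj,
                  &filmy[licznik].oceny[0],
                  &filmy[licznik].oceny[1],
                  &filmy[licznik].oceny[2],
                  &filmy[licznik].oceny[3],
                  &filmy[licznik].oceny[4]) == 8) {
        licznik++;
    }
    return licznik;
}
```

The title is passed as `filmy[licznik].nazwa` without `&`, because the array already decays to the `char *` that `%[` expects. Printing should then loop up to the returned count rather than a fixed N, so that records which were never filled in are not printed.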

You have similar issues with `%f` and `%d`: the behavior is undefined if the input contains values that cannot be represented in an int or a float, but it's not generally practical to worry about that. If you are worried about it, you shouldn't be using scanf. It's a truism that you shouldn't be using scanf, though.
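For comparison, the fgets plus sscanf combination suggested in the comments could be sketched roughly as follows. As before, the struct layout and buffer sizes are assumptions, and `read_film` is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record layout; the question does not show the real struct. */
struct film {
    char  nazwa[64];   /* title */
    int   rok;         /* year */
    char  rodzaj;      /* genre letter */
    float oceny[5];    /* five ratings */
};

/* Reads one two-line record. Returns 1 on success, 0 on end of file
   or if the second line does not parse completely. */
int read_film(FILE *f, struct film *out)
{
    char line[256];

    /* Line 1: the title, possibly containing spaces. */
    if (fgets(out->nazwa, sizeof out->nazwa, f) == NULL)
        return 0;
    out->nazwa[strcspn(out->nazwa, "\n")] = '\0';  /* drop the newline */

    /* Line 2: year, genre letter, five ratings. */
    if (fgets(line, sizeof line, f) == NULL)
        return 0;
    return sscanf(line, "%d %c %f %f %f %f %f",
                  &out->rok, &out->rodzaj,
                  &out->oceny[0], &out->oceny[1], &out->oceny[2],
                  &out->oceny[3], &out->oceny[4]) == 7;
}
```

Reading whole lines first keeps the stream position predictable, so a malformed record cannot leave a stray newline behind for the next read. One limitation of this sketch: a title longer than the buffer is split, and its remainder is then read as the numeric line, which will normally fail to parse.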
