I am trying to read a formatted input file that contains 3 columns using fscanf() in C. Each line consists of a character, a string, and another character, separated by single spaces, like this:

a hi c
i hello l
  abc z  //issue
2 mystr k 
...

The issue is that the first column is not read correctly when it is a whitespace character. My fscanf() call is the following:

while (fscanf(matrix, "%c %s %c\n", &one, &two, &three) != EOF) {
    ...
}

So, for line 3 in the example above, I want the space to be stored in the variable "one".

I have tried some other suggestions found online, such as "%*1[\t]", but they do not help. Is there any way to make fscanf() read the whitespace character and store it in the char? Thanks

apix
  • What behavior do you get? What happens if the line with white space at the start is the first line? – Solomon Ucko Feb 28 '23 at 04:10
  • The trailing newline in the format string is a UX disaster, and breaks your input when the data comes from a file. The `scanf()` family are lackadaisical about space; they'll skip it at the slightest opportunity (except for `%c`, `%n`, or `%[…]` (scan sets)). You'll need to read each line and then parse it with `sscanf()` and your format string. – Jonathan Leffler Feb 28 '23 at 04:23
  • Please show a [mcve]. – n. m. could be an AI Feb 28 '23 at 08:14
  • What is the data type of `one`, `two`, and `three`? Are they of the same type? Please provide a [mre]. – Andreas Wenzel Feb 28 '23 at 08:16

2 Answers

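First, enable all compiler warnings. If two is a char array, passing &two for %s is a type mismatch that a well-configured compiler will likely flag.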

--

You could use fscanf(matrix, "%c%*c%5s%*c%c%*c", &one, two, &three) == 3 as the loop condition.

But it is better to read a line with fgets() and then parse it.
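As a rough sketch of how that format string could be used in a loop: the helper name read_records and the struct record type below are hypothetical, and the field width of 5 assumes that the middle column never exceeds 5 characters (so the buffer needs 6 bytes). Each %*c reads and discards exactly one character, so the separators and the newline are consumed without skipping a leading space on the next line.

```c
#include <stdio.h>

// Hypothetical record type for illustration; two[6] matches the %5s width.
struct record
{
    char one;
    char two[6];
    char three;
};

// Hypothetical helper: reads up to max records from fp and returns the
// number of records that were converted completely.
int read_records( FILE *fp, struct record recs[], int max )
{
    int count = 0;

    // %c reads any character, including a space.
    // %*c discards exactly one character (a separator or the newline).
    while ( count < max &&
            fscanf( fp, "%c%*c%5s%*c%c%*c",
                    &recs[count].one,
                    recs[count].two,
                    &recs[count].three ) == 3 )
    {
        count++;
    }

    return count;
}
```

Called on a stream that contains the four sample lines from the question, this should store a space in the first field of the third record. The format is fragile, though: a trailing space at the end of a line or a longer middle column will put the parser out of step, which is why reading whole lines first is the safer approach.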

chux - Reinstate Monica

The problem is that the fscanf format string "%c %s %c\n" contains a trailing whitespace character (in this case \n). This whitespace character tells fscanf to read and discard all whitespace characters from the input stream until it encounters a non-whitespace character.

Therefore, when fscanf reads the lines

i hello l
  abc z

it will successfully match i, hello and l, but then the trailing \n in the format string will read and discard the whitespace characters "\n ", which includes the leading space of the next line. However, you only want it to read and discard "\n".
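The effect can be reproduced with a small sketch. The helper name read_with_trailing_newline is hypothetical, the temporary file stands in for the real input file, and a field width of 19 has been added to %s as a precaution; otherwise the format string is the one from the question.

```c
#include <stdio.h>

// Hypothetical helper for illustration: writes text to a temporary file,
// reads it back with the format string from the question, prints each
// record, and stores the first field of each record in firsts.
// Returns the number of records read, or -1 if no temporary file
// could be created.
int read_with_trailing_newline( const char *text, char firsts[], int max )
{
    FILE *fp = tmpfile();
    int count = 0;
    char one, two[20], three;

    if ( fp == NULL )
        return -1;

    fputs( text, fp );
    rewind( fp );

    // The trailing "\n" discards ALL following whitespace, including
    // a leading space on the next line.
    while ( count < max &&
            fscanf( fp, "%c %19s %c\n", &one, two, &three ) == 3 )
    {
        printf( "[%c] [%s] [%c]\n", one, two, three );
        firsts[count++] = one;
    }

    fclose( fp );
    return count;
}
```

With the four sample lines as input, the third record comes out as [a] [bc] [z] instead of [ ] [abc] [z]: the leading space has already been discarded, so %c picks up the a and %s is left with bc.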

There are several ways of solving this problem:

  1. Instead of using a whitespace character which instructs fscanf to read as many whitespace characters as possible, you can use the %*c format specifier to read and discard only a single character. However, this will only work if that character is guaranteed to be the newline character.

  2. A more robust solution would be to first read the line using fgets instead of fscanf. In contrast to fscanf, fgets will always read exactly one entire line (if it can), so it will not leave any characters of a line on the input stream, and it also will not attempt to read characters of the next line. After reading the line and storing the line in a memory buffer as a string, you can use sscanf on that memory buffer.
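The first option could be sketched as follows. The helper name read_with_single_discard is hypothetical and the temporary file only stands in for the real input; the one relevant change is that the trailing \n in the format string has been replaced by %*c.

```c
#include <stdio.h>

// Hypothetical helper for illustration: writes text to a temporary file,
// reads it back record by record, and stores the first field of each
// record in firsts. Returns the number of records read, or -1 if no
// temporary file could be created.
int read_with_single_discard( const char *text, char firsts[], int max )
{
    FILE *fp = tmpfile();
    int count = 0;
    char one, two[20], three;

    if ( fp == NULL )
        return -1;

    fputs( text, fp );
    rewind( fp );

    // %*c reads and discards exactly one character after the third
    // field. This assumes that character is always the newline.
    while ( count < max &&
            fscanf( fp, "%c %19s %c%*c", &one, two, &three ) == 3 )
    {
        firsts[count++] = one;
    }

    fclose( fp );
    return count;
}
```

For the four sample lines, the third entry of firsts should now be a space. If a line ends with a trailing space, however, %*c discards that space instead of the newline and the following records are misread.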

Here is an example of the second solution:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

//forward function declaration
bool get_line_from_stream( char buffer[], int buffer_size, FILE *fp );

int main( void )
{
    FILE *fp;
    char line[200];

    fp = fopen( "input.txt", "r" );
    if ( fp == NULL )
    {
        fprintf( stderr, "Error opening file!\n" );
        exit( EXIT_FAILURE );
    }

    while ( get_line_from_stream( line, sizeof line, fp ) )
    {
        char one, two[20], three;

        if ( sscanf( line, "%c %19s %c", &one, two, &three ) == 3 )
        {
            printf(
                "Successfully read the following record:\n"
                "  One: %c\n"
                "  Two: %s\n"
                "  Three: %c\n",
                one, two, three
            );
        }
        else
        {
            fprintf( stderr, "Parsing error!\n" );
        }
    }

    fclose( fp );
}

//This function will read exactly one line of input and remove the
//newline character, if it exists. On success, it will return true.
//If this function is unable to read any further lines due to
//end-of-file, it returns false. If it fails for any other reason, it
//will not return, but will print an error message and call "exit"
//instead.
bool get_line_from_stream( char buffer[], int buffer_size, FILE *fp )
{
    char *p;

    //attempt to read one line from the stream
    if ( fgets( buffer, buffer_size, fp ) == NULL )
    {
        if ( !feof(fp) )
        {
            fprintf( stderr, "Input error!\n" );
            exit( EXIT_FAILURE );
        }

        return false;
    }

    //make sure that line was not too long for input buffer
    p = strchr( buffer, '\n' );
    if ( p == NULL )
    {
        //a missing newline character is ok if the next
        //character is a newline character or if we have
        //reached end-of-file (feof is not set until a read
        //attempt hits end-of-file, so check getc's result)
        int c = getc( fp );
        if ( c != '\n' && c != EOF )
        {
            fprintf( stderr, "Line is too long for memory buffer!\n" );
            exit( EXIT_FAILURE );
        }
    }
    else
    {
        //remove newline character by overwriting it with a null
        //character
        *p = '\0';
    }

    return true;
}

For the input

a hi c
i hello l
  abc z
2 mystr k

this program has the following output:

Successfully read the following record:
  One: a
  Two: hi
  Three: c
Successfully read the following record:
  One: i
  Two: hello
  Three: l
Successfully read the following record:
  One:  
  Two: abc
  Three: z
Successfully read the following record:
  One: 2
  Two: mystr
  Three: k
Andreas Wenzel