Let's start with some basics. When reading lines from a file (or lines of input from the user), you will generally want to use a line-oriented input function such as fgets or POSIX getline to make sure you read an entire line at a time and do not have what is left in your input buffer depend on which scanf conversion specifier was used last. With fgets you will need to provide a buffer of sufficient size to hold the entire line, or dynamically allocate and realloc as needed until an entire line is read (getline handles this for you). You validate that an entire line was read by checking that the last character is the '\n' character or that the length of the string read is less than the maximum that fits in the buffer (both are left to you below).
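As a minimal sketch of that completeness check (the helper name trim_newline is hypothetical, not part of any library), you could overwrite the trailing '\n' and report whether one was found:

```c
#include <string.h>

/* Hypothetical helper: strip the trailing '\n' left by fgets.
 * Returns 1 if a newline was found (the whole line fit in buf),
 * 0 otherwise (line was truncated, or the final line lacks '\n').
 */
int trim_newline (char *buf)
{
    size_t len = strcspn (buf, "\n");   /* length up to first '\n' */

    if (buf[len] != '\n')               /* no newline in buffer */
        return 0;

    buf[len] = 0;                       /* overwrite '\n' with nul */
    return 1;
}
```

A return of 0 while the buffer is full (strlen (buf) == MAXC - 1) usually means the line was longer than the buffer and the remainder is still waiting to be read.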
Once you have a line of text read, you have two options. You can use sscanf to convert the digits in your buffer to integer values, either by knowing the number of values contained in the line beforehand and providing an adequate number of conversion specifiers, or by converting each value individually and using the "%n" specifier to report the number of characters consumed by that conversion, then advancing the start position within your buffer by that amount for the next conversion.
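The "%n" approach could look something like the following sketch. The function name ints_from_sscanf and its parameters are made up for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: convert up to max ints from buf with sscanf,
 * using "%n" to learn how many characters each conversion consumed.
 * Returns the number of values stored in out.
 */
int ints_from_sscanf (const char *buf, int *out, int max)
{
    int n = 0,          /* values converted so far */
        used = 0;       /* chars consumed by the last conversion */

    /* "%n" is not counted in the return, so success is still 1 */
    while (n < max && sscanf (buf, "%d%n", &out[n], &used) == 1) {
        buf += used;    /* advance past what was just converted */
        n++;
    }

    return n;
}
```

Note that this stops at the first thing that is not a number, and sscanf gives you no way to detect a value too large for int, which is part of why the strtol approach is more robust.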
Your other option, and by far the most flexible and robust from an error checking and reporting standpoint, is to use strtol and use the endptr parameter for its intended purpose of providing a pointer to one past the last digit converted, allowing you to walk down your buffer directly, converting values as you go. See: strtol(3) - Linux manual page. strtol provides the ability to discriminate between a failure where no digits were converted and one where overflow or underflow occurred (setting errno to an appropriate value), and it allows you to test whether additional characters remain after the conversion through the endptr parameter for control of your value conversion loop.
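To illustrate walking down a buffer with endptr, here is a small sketch that adds up the integers in a string. The function sum_longs is hypothetical and deliberately stricter than the example later in this answer: it treats anything other than numbers and whitespace as an error.

```c
#include <stdlib.h>
#include <errno.h>

/* Hypothetical helper: add up the whitespace-separated integers in s.
 * Returns 0 on success with the total in *sum, -1 if a conversion
 * overflowed or if anything other than numbers and whitespace is found.
 */
int sum_longs (const char *s, long *sum)
{
    char *endptr;

    *sum = 0;
    for (;;) {
        errno = 0;
        long v = strtol (s, &endptr, 10);

        if (s == endptr)        /* no digits converted: stop */
            break;
        if (errno == ERANGE)    /* value out of range for long */
            return -1;

        *sum += v;              /* (overflow of the sum is not checked) */
        s = endptr;             /* continue one past the last digit */
    }

    while (*s == ' ' || *s == '\t' || *s == '\n')   /* skip trailing ws */
        s++;

    return *s ? -1 : 0;         /* leftover characters are an error */
}
```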
As with any code you write, validating each necessary step will ensure you can respond appropriately.
Let's start with your sample input file:
Example Input File
$ cat dat/int_file.txt
5
1 78 45 32 2
When faced with a single value on the first line, a majority of the time you will simply want to convert that value with fscanf (file, "%d", &ival);, which is fine, but the pitfall of using any of the scanf family is that YOU must account for any characters left in the input buffer following conversion. While the "%d" conversion specifier will provide the needed conversion, character extraction stops with the last digit, leaving the '\n' unread. As long as you account for that fact, it's fine to use fscanf to grab the first value. However, you must validate each step along the way.
Let's look at the beginning of an example doing just that, opening a file (or reading from stdin if no filename is given), validating the file is open, and then validating first_num is read, e.g.
#include <stdio.h>
#include <stdlib.h>     /* for malloc/free & EXIT_FAILURE */
#include <errno.h>      /* for strtol validation */
#include <limits.h>     /* for INT_MIN/INT_MAX */

#define MAXC 1024       /* don't skimp on buffer size */

int main (int argc, char **argv) {

    int first_num,      /* your first_num */
        *arr = NULL,    /* a pointer to block to fill with int values */
        nval = 0;       /* the number of values converted */
    char buf[MAXC];     /* buffer to hold subsequent lines read */
    /* open file passed as 1st argument (default: stdin if no argument) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("fopen-file");
        exit (EXIT_FAILURE);
    }

    if (fscanf (fp, "%d", &first_num) != 1) {   /* read/validate int */
        fputs ("error: invalid file format, integer not first.\n", stderr);
        exit (EXIT_FAILURE);
    }
At this point, your input buffer contains:
\n
1 78 45 32 2
Since you are going to embark on a line-oriented read of the remaining lines in the file, you can simply make your first call to fgets for the purpose of reading and discarding the '\n', e.g.
    if (!fgets (buf, MAXC, fp)) {   /* read/discard '\n' */
        fputs ("error: non-POSIX ending after 1st integer.\n", stderr);
        exit (EXIT_FAILURE);
    }
(note: the validation. If the file had ended immediately after the first integer without a '\n' (a non-POSIX line end), fgets would fail, and unless you are checking, you are likely to invoke undefined behavior by later attempting to read from a file stream where no characters remain to be read and thereafter attempting to read from a buffer with indeterminate contents)
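The fgets call above assumes whatever follows the first integer fits in buf. If you would rather not rely on that, one alternative is a character-by-character discard. This is only a sketch, and discard_line is a hypothetical name:

```c
#include <stdio.h>

/* Hypothetical helper: read and discard the rest of the current line.
 * Returns '\n' if a newline was consumed, EOF if the stream ended first.
 */
int discard_line (FILE *fp)
{
    int c;

    /* consume characters until the newline or end of stream */
    while ((c = fgetc (fp)) != '\n' && c != EOF) {}

    return c;
}
```

Here the caller decides whether an EOF return is an error, so a file holding only the first integer with no trailing '\n' can still be reported in whatever way makes sense for your program.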
You can allocate storage for first_num number of integers at this point and assign the starting address for that new block to arr for filling with integer values, e.g.
    /* allocate/validate storage for first_num integers */
    if (!(arr = malloc (first_num * sizeof *arr))) {
        perror ("malloc-arr");
        exit (EXIT_FAILURE);
    }
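One thing the allocation above does not guard against is a first_num that is zero or negative. A negative int is converted to a very large size_t in the multiplication, so the request will either fail or be far larger than intended. A sketch of a stricter allocation (alloc_ints is a hypothetical helper) could be:

```c
#include <stdlib.h>
#include <stdint.h>

/* Hypothetical helper: allocate storage for count ints.
 * Returns NULL if count is not positive, if the byte count would
 * overflow size_t, or if malloc itself fails.
 */
int *alloc_ints (int count)
{
    if (count <= 0)                                 /* reject 0 and negatives */
        return NULL;

    if ((size_t)count > SIZE_MAX / sizeof (int))    /* multiplication check */
        return NULL;

    return malloc ((size_t)count * sizeof (int));
}
```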
For reading the remaining values in your file, you could just make a single call to fgets and then turn to converting the integer values contained within the buffer filled, but with just a little forethought, you can craft an approach that will read as many lines as needed until first_num integers have been converted or EOF is encountered. Whether you are taking input or converting values in a buffer, a robust approach is the same: Loop Continually Until You Get What You Need Or Run Out Of Data, e.g.
    while (fgets (buf, MAXC, fp)) {     /* read lines until conversions made */
        char *p = buf,      /* nptr & endptr for strtol conversion */
            *endptr;
        if (*p == '\n')     /* skip blank lines */
            continue;
        while (nval < first_num) {      /* loop until nval == first_num */
            errno = 0;                  /* reset errno for each conversion */
            long tmp = strtol (p, &endptr, 0);  /* call strtol */
            if (p == endptr && tmp == 0) {      /* validate digits converted */
                /* no digits converted - scan forward to next +/- or [0-9] */
                do
                    p++;
                while (*p && *p != '+' && *p != '-' &&
                        ( *p < '0' || '9' < *p));
                if (*p)         /* valid start of numeric sequence? */
                    continue;   /* go attempt next conversion */
                else
                    break;      /* go read next line */
            }
            else if (errno) {   /* validate successful conversion */
                fputs ("error: overflow/underflow in conversion.\n", stderr);
                exit (EXIT_FAILURE);
            }
            else if (tmp < INT_MIN || INT_MAX < tmp) {  /* validate int */
                fputs ("error: value exceeds range of 'int'.\n", stderr);
                exit (EXIT_FAILURE);
            }
            else {  /* valid conversion - in range of int */
                arr[nval++] = tmp;      /* add value to array */
                if (*endptr && *endptr != '\n') /* if chars remain */
                    p = endptr;         /* update p to endptr */
                else                    /* otherwise */
                    break;              /* bail */
            }
        }
        if (nval == first_num)  /* are all values filled? */
            break;
    }
Now let's unpack this a bit. The first thing that occurs is that fgets reads a line from your file into buf. You then declare the pointers needed to work with strtol and assign the starting address of buf to p. There is no need to attempt conversion on a blank line, so we test the first character in buf and if it is a '\n' we get the next line with:
...
        if (*p == '\n')     /* skip blank lines */
            continue;
...
Once you have a non-empty line, you start your conversion loop and will attempt conversions until the number of values you have equals first_num or you reach the end of the line. Your loop control is simply:
        while (nval < first_num) {      /* loop until nval == first_num */
            ...
        }
Within the loop you will fully validate your attempted conversions with strtol by resetting errno = 0; before each conversion and assigning the return of the conversion to a temporary long int value (strtol is string-to-long), e.g.
errno = 0; /* reset errno for each conversion */
long tmp = strtol (p, &endptr, 0); /* call strtol */
Once you make the conversion, you have three conditions to validate before you have a good integer conversion:
- if NO digits were converted, then p == endptr (and per the man page the return is set to zero). So to check whether this condition occurred, you can check: if (p == endptr && tmp == 0);
- if there was an error during conversion of digits, regardless of which error occurred, errno will be set to a non-zero value, allowing you to check for an error in conversion with if (errno). You can also dive further into which error occurred as specified in the man page, but for validation purposes here it is enough to know whether an error occurred; and finally
- if digits were converted and there was no error, you are still not done. The strtol conversion is to a value of long that may or may not fit in an int (e.g. long is 8-bytes on x86_64 Linux while int is 4-bytes). So to ensure the converted value will fit in your integer array, you need to check that the value returned is within INT_MIN and INT_MAX before you assign the value to an element of arr.
(note: with the first condition above, just because no digits were converted does not mean there were no digits in the line, it just means the first non-whitespace character was not the start of a number. You should scan forward in the line using your pointer to find the next +/- or [0-9] to determine if further numeric values exist. That is the purpose of the do/while loop within that code block)
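The three checks described above can also be gathered into one function so the caller gets a single status back. This is a sketch only: the enum and the name str_to_int are hypothetical, not part of the example program.

```c
#include <stdlib.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical status codes for a single conversion attempt */
enum conv_status { CONV_OK, CONV_NO_DIGITS, CONV_ERANGE, CONV_NOT_INT };

/* Hypothetical helper: convert the text at p to int, applying the three
 * checks in order. *endptr is always set by strtol; *out is written
 * only when CONV_OK is returned.
 */
enum conv_status str_to_int (const char *p, char **endptr, int *out)
{
    errno = 0;
    long tmp = strtol (p, endptr, 0);

    if (p == *endptr)                       /* no digits converted */
        return CONV_NO_DIGITS;
    if (errno)                              /* overflow/underflow of long */
        return CONV_ERANGE;
    if (tmp < INT_MIN || INT_MAX < tmp)     /* outside the range of int */
        return CONV_NOT_INT;

    *out = (int)tmp;
    return CONV_OK;
}
```

Keep in mind that a base of 0 lets strtol accept hexadecimal ("0x...") and octal (leading "0") input, so "010" converts to 8. Pass 10 as the base if you want decimal only.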
Once you have a good integer value, recall that endptr will be set to the next character after the last digit converted. A quick check whether *endptr is not the nul-terminating character and not the line-ending will tell you whether characters remain that are available for conversion. If so, simply update p = endptr so that your pointer now points one past the last digit converted and repeat. (you can also scan forward at this point with the same do/while loop used above to determine if another numeric value exists -- this is left to you)
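If you do want that scan-forward step in more than one place, it can be pulled out into a small function. A sketch, with next_number being a hypothetical name:

```c
#include <stddef.h>

/* Hypothetical helper: return a pointer to the next '+', '-' or digit
 * at or after p, or NULL if the rest of the string has none.
 */
const char *next_number (const char *p)
{
    while (*p && *p != '+' && *p != '-' && (*p < '0' || '9' < *p))
        p++;

    return *p ? p : NULL;
}
```

Unlike the do/while in the loop above, this checks the character at p itself before advancing. A lone sign that is not followed by a digit will still fail conversion, so after a failed strtol call the caller needs to step past that character before scanning again.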
Once the inner loop completes, all you need do is check if nval == first_num to know whether you need to continue collecting values.
Putting it all together, you could do something similar to:
#include <stdio.h>
#include <stdlib.h>     /* for malloc/free & EXIT_FAILURE */
#include <errno.h>      /* for strtol validation */
#include <limits.h>     /* for INT_MIN/INT_MAX */

#define MAXC 1024       /* don't skimp on buffer size */

int main (int argc, char **argv) {

    int first_num,      /* your first_num */
        *arr = NULL,    /* a pointer to block to fill with int values */
        nval = 0;       /* the number of values converted */
    char buf[MAXC];     /* buffer to hold subsequent lines read */
    /* open file passed as 1st argument (default: stdin if no argument) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("fopen-file");
        exit (EXIT_FAILURE);
    }

    if (fscanf (fp, "%d", &first_num) != 1) {   /* read/validate int */
        fputs ("error: invalid file format, integer not first.\n", stderr);
        exit (EXIT_FAILURE);
    }

    if (!fgets (buf, MAXC, fp)) {   /* read/discard '\n' */
        fputs ("error: non-POSIX ending after 1st integer.\n", stderr);
        exit (EXIT_FAILURE);
    }

    /* allocate/validate storage for first_num integers */
    if (!(arr = malloc (first_num * sizeof *arr))) {
        perror ("malloc-arr");
        exit (EXIT_FAILURE);
    }

    while (fgets (buf, MAXC, fp)) {     /* read lines until conversions made */
        char *p = buf,      /* nptr & endptr for strtol conversion */
            *endptr;
        if (*p == '\n')     /* skip blank lines */
            continue;
        while (nval < first_num) {      /* loop until nval == first_num */
            errno = 0;                  /* reset errno for each conversion */
            long tmp = strtol (p, &endptr, 0);  /* call strtol */
            if (p == endptr && tmp == 0) {      /* validate digits converted */
                /* no digits converted - scan forward to next +/- or [0-9] */
                do
                    p++;
                while (*p && *p != '+' && *p != '-' &&
                        ( *p < '0' || '9' < *p));
                if (*p)         /* valid start of numeric sequence? */
                    continue;   /* go attempt next conversion */
                else
                    break;      /* go read next line */
            }
            else if (errno) {   /* validate successful conversion */
                fputs ("error: overflow/underflow in conversion.\n", stderr);
                exit (EXIT_FAILURE);
            }
            else if (tmp < INT_MIN || INT_MAX < tmp) {  /* validate int */
                fputs ("error: value exceeds range of 'int'.\n", stderr);
                exit (EXIT_FAILURE);
            }
            else {  /* valid conversion - in range of int */
                arr[nval++] = tmp;      /* add value to array */
                if (*endptr && *endptr != '\n') /* if chars remain */
                    p = endptr;         /* update p to endptr */
                else                    /* otherwise */
                    break;              /* bail */
            }
        }
        if (nval == first_num)  /* are all values filled? */
            break;
    }

    if (nval < first_num) { /* validate required integers found */
        fputs ("error: EOF before all integers read.\n", stderr);
        exit (EXIT_FAILURE);
    }

    for (int i = 0; i < nval; i++)  /* loop outputting each integer */
        printf ("arr[%2d] : %d\n", i, arr[i]);

    free (arr);         /* don't forget to free the memory you allocate */

    if (fp != stdin)    /* and close any file streams you have opened */
        fclose (fp);

    return 0;
}
(note: the final check of if (nval < first_num) after exiting the read and conversion loop)
Example Use/Output
With your example file, you would get the following:
$ ./bin/fgets_int_file dat/int_file.txt
arr[ 0] : 1
arr[ 1] : 78
arr[ 2] : 45
arr[ 3] : 32
arr[ 4] : 2
Why Go To The Extra Trouble?
By thoroughly understanding the conversion process and going to the few additional lines of trouble, you end up with a routine that can provide flexible input handling for whatever number of integers you need regardless of the input file format. Let's look at another variation of your input file:
A More Challenging Input File
$ cat dat/int_file2.txt
5
1 78
45
32 2 144 91 270
foo
What changes are needed to handle retrieving the same first five integer values from this file? (hint: none - try it)
An Even More Challenging Input File
What if we up the ante again?
$ cat dat/int_file3.txt
5
1 two buckle my shoe, 78 close the gate
45 is half of ninety
foo bar
32 is sixteen times 2 and 144 is a gross, 91 is not prime and 270 caliber
baz
What changes are needed to read the first 5 integer values from this file? (hint: none)
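To see why no changes are needed, it can help to look at the per-line conversion in isolation. The following is a simplified sketch, not the program above: extract_ints is a hypothetical function, it uses base 10 instead of 0, it steps over non-numeric text one character at a time, and it returns -1 instead of exiting on an out-of-range value.

```c
#include <stdlib.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical helper: store up to max ints found anywhere in line.
 * Text that is not part of a number is skipped one character at a time.
 * Returns the number of values stored, or -1 on an out-of-range value.
 */
int extract_ints (const char *line, int *out, int max)
{
    int n = 0;

    while (*line && n < max) {
        char *endptr;

        errno = 0;
        long tmp = strtol (line, &endptr, 10);

        if (line == endptr) {       /* nothing converted here */
            line++;                 /* step over one character, retry */
            continue;
        }
        if (errno || tmp < INT_MIN || INT_MAX < tmp)
            return -1;              /* overflow, or does not fit in int */

        out[n++] = (int)tmp;
        line = endptr;              /* resume one past the last digit */
    }

    return n;
}
```

Given the line "1 two buckle my shoe, 78 close the gate", this stores 1 and 78 and simply walks past everything else.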
But I Want To Specify the Line To Start Reading From
OK, let's take another input file to go along with the example. Say:
An Example Input Reading From A Given Line
$ cat dat/int_file4.txt
5
1,2 buckle my shoe, 7,8 close the gate
45 is half of ninety
foo bar
32 is sixteen times 2 and 144 is a gross, 91 is not prime and 270 caliber
baz
1 78 45 32 2 27 41 39 1111
a quick brown fox jumps over the lazy dog
What would I have to change? The only changes needed are changes to skip the first 10 lines and begin your conversion loop at line 11. To do that you would need to add a variable to hold the value of the line to start reading integers on (say rdstart) and a variable to hold the line count so we know when to start reading (say linecnt), e.g.
    int first_num,
        *arr = NULL,
        nval = 0,
        rdstart = argc > 2 ? strtol (argv[2], NULL, 0) : 2,
        linecnt = 1;
(note: the line to start the integer read from is taken as the 2nd argument to the program or a default of line 2 is used if none is specified -- and yes, you should apply the same full validations to this use of strtol, but that I leave to you)
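One way that validation might look is sketched below. The function parse_line_number is hypothetical, and rejecting trailing characters and values below 1 are choices made for this sketch rather than anything the program above requires:

```c
#include <stdlib.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical helper: convert a command-line argument to a line number.
 * Returns 1 and stores the value in *out only if arg is a complete,
 * in-range integer of at least 1; returns 0 otherwise.
 */
int parse_line_number (const char *arg, int *out)
{
    char *endptr;

    errno = 0;
    long tmp = strtol (arg, &endptr, 0);

    if (arg == endptr)              /* no digits converted */
        return 0;
    if (*endptr)                    /* trailing characters, e.g. "11x" */
        return 0;
    if (errno || tmp < 1 || INT_MAX < tmp)  /* out of range, or below line 1 */
        return 0;

    *out = (int)tmp;
    return 1;
}
```

You could then fall back to the default of 2, or exit with a usage message, whenever it returns 0.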
What else needs changing? Not much. Instead of reading and discarding the '\n' left by fscanf just once, read and discard lines rdstart - 1 times (linecnt is initialized to 1 and the loop stops when it reaches rdstart). To accomplish that, simply wrap your first call to fgets in a loop (and change the error message to make sense), e.g.
    while (linecnt < rdstart) {     /* loop until linecnt == rdstart */
        if (!fgets (buf, MAXC, fp)) {   /* read/discard line */
            fputs ("error: less than requested no. of lines.\n", stderr);
            exit (EXIT_FAILURE);
        }
        linecnt++;      /* increment linecnt */
    }
That's it. (and note it will continue to handle the first 3 input files as well just by omitting the second parameter...)
Example Output Start At Line 11
Does it work?
$ ./bin/fgets_int_file_line dat/int_file4.txt 11
arr[ 0] : 1
arr[ 1] : 78
arr[ 2] : 45
arr[ 3] : 32
arr[ 4] : 2
Look things over and let me know if you have further questions. There are many ways to do this, but by far, if you learn how to use strtol (all the strtoX functions work very much the same), you will be well ahead of the game in handling numeric conversion.