You have started off on the wrong foot; see Why is while ( !feof (file) ) always wrong? While there are a number of ways to separate the information into id, name and score, probably the most basic is to read an entire line of data into a temporary buffer (character array) and then use sscanf to separate id, name and score.
The parsing with sscanf is not difficult. The only caveat is that your name can contain whitespace, so you cannot simply use "%s" as the format specifier to extract the name. This is mitigated by your score field always starting with a digit, and by digits not occurring in names. (There are always exceptions to the rule, and they can be handled with a simple parse using a pair of pointers, but for this basic example we will make that formatting assumption.)
To make data handling simpler, and to coordinate all the information for one student as a single object (allowing you to create an array of them to hold all student information), you can use a simple struct. Declaring a few constants to set the sizes for everything avoids using magic numbers throughout your code. (For the sscanf field-width modifiers, though, literal numbers must be written in the format string, since a named constant or variable cannot be used directly as the width modifier.) For example, your struct could be:
#define MAXID    8      /* if you need a constant, #define one (or more) */
#define MAXNM   64
#define MAXSTD 128
#define MAXLN  MAXSTD

typedef struct {        /* simple struct to hold student data */
    char id[MAXID];
    char name[MAXNM];
    double score;
} student_t;
(Note: POSIX reserves the "_t" suffix for its own type names. There won't be a conflicting "student_t" type, but be aware of the restriction in general, even though you will see the "_t" suffix used frequently.)
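On the field-width point above: one common workaround, which is not used in the rest of this answer, is to let the preprocessor turn a constant into the digits of the format string. The sketch below assumes a hypothetical IDW constant and a hypothetical read_id() helper; neither appears in the original code. It only works when the constant is a plain integer literal (not an expression such as MAXID - 1).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define IDW   7             /* hypothetical: max chars to read for id */
#define MAXID (IDW + 1)     /* storage for id, +1 for nul-terminating char */

#define STR(x)  #x          /* turn the argument into a string literal */
#define XSTR(x) STR(x)      /* expand the macro argument, then stringify */

/* hypothetical helper: read an id of at most IDW chars from line,
 * returns 1 if an id was read, 0 otherwise
 */
int read_id (const char *line, char *id)
{
    assert (line && id);

    /* adjacent string literals concatenate: "%" "7" "s" becomes "%7s" */
    return sscanf (line, "%" XSTR(IDW) "s", id) == 1;
}
```

Whether that is clearer than writing "%7s" next to a comment is a judgment call; the literal width is kept in the examples that follow.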
The basic approach is to read a line from your file into a buffer (with either fgets
or POSIX getline
) and then pass the line to sscanf
. You condition your read loop on the successful read of each line so your read stops when EOF
is reached. For separating the values with sscanf
, it is convenient to use a temporary struct to hold the separated values. That way if the separation is successful, you simply add the temporary struct to your array. To read the students into an array of student_t
you could do:
size_t readstudents (FILE *fp, student_t *s)
{
    char buf[MAXLN];    /* temporary array (buffer) to hold line */
    size_t n = 0;       /* number of students read from file */

    /* read each line in file until file read or array full */
    while (n < MAXSTD && fgets (buf, MAXLN, fp)) {
        student_t tmp = { .id = "" };   /* temporary struct to fill */
        /* extract id, name and score from line, validate */
        if (sscanf (buf, "%7s %63[^0-9] %lf", tmp.id, tmp.name, &tmp.score) == 3) {
            char *p = strrchr (tmp.name, 0);    /* pointer to end of name */
            /* backup overwriting trailing spaces with nul-terminating char */
            while (p && --p >= tmp.name && *p == ' ')
                *p = 0;
            s[n++] = tmp;   /* add temp struct to array, increment count */
        }
    }

    return n;   /* return number of students read from file */
}
Now let's take a minute and look at the sscanf
format string used:
sscanf (buf, "%7s %63[^0-9] %lf", tmp.id, tmp.name, &tmp.score)
Above, with the line in buf
, the format string used is "%7s %63[^0-9] %lf"
. Each character array type uses a field-width modifier to limit the number of characters stored in the associated array to one-less-than the number of characters available. This protects the array bounds and ensures that each string stored is nul-terminated. The "%7s"
is self-explanatory - read at most 7-characters into what will be the id
.
The next conversion specifier, for the name, is "%63[^0-9]". It is a bit more involved, as it uses the "%[...]" character-class conversion specifier with the match inverted by '^' as the first character. With the class being the digits 0-9, the conversion specifier reads up to 63 characters that are not digits. This has the side effect of including the spaces between name and score in name. Thankfully they are simple to remove: get a pointer to the end of the string with strrchr (tmp.name, 0); then back up, checking whether each character is a ' ' (space) and overwriting it with a nul-terminating character (e.g. '\0' or its numeric equivalent 0).
The last part of the sscanf
conversion, "%lf"
is simply the conversion specifier for the double
value for score
.
Note: most importantly, the conversion is validated by checking that the return of the call to sscanf is 3, the number of conversions requested. If all conversions succeed into the temporary struct tmp, then tmp is simply added to your array of structs.
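To see the format string and the return check in isolation, here is a minimal sketch that wraps the same sscanf call and trailing-space trim in a hypothetical helper, parse_student(). The helper name and the sample lines mentioned afterwards are assumptions for illustration; the trim uses strlen rather than strrchr, but does the same job.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXID  8
#define MAXNM 64

typedef struct {        /* same layout as the struct above */
    char id[MAXID];
    char name[MAXNM];
    double score;
} student_t;

/* hypothetical helper: parse one line into *s,
 * returns 1 on success, 0 if the line does not match
 */
int parse_student (const char *line, student_t *s)
{
    student_t tmp = { .id = "" };

    assert (line && s);

    /* all 3 conversions must succeed for the line to be accepted */
    if (sscanf (line, "%7s %63[^0-9] %lf", tmp.id, tmp.name, &tmp.score) != 3)
        return 0;

    /* trim the spaces that "%63[^0-9]" collected after the name */
    size_t len = strlen (tmp.name);
    while (len && tmp.name[len - 1] == ' ')
        tmp.name[--len] = 0;

    *s = tmp;           /* only touch the caller's struct on success */
    return 1;
}
```

Given a line such as "0069 Natalia Long Young 8", the helper stores the id "0069", the name "Natalia Long Young" and the score 8. A line with no score, such as "0001 William Bob", is rejected because only two conversions succeed.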
To call the function from main()
and read the student information, you simply declare an array of student_t
to hold the information, open and validate your data file is open for reading, and make a call to readstudents
capturing the return to validate that student information was actually read from the file. Then you can make use of the data as you wish (it is simply output below):
int main (int argc, char **argv) {

    student_t students[MAXSTD] = {{ .id = "" }};    /* array of students */
    size_t nstudents = 0;                           /* count of students */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }

    /* read students from file, validate return, if zero, handle error */
    if ((nstudents = readstudents (fp, students)) == 0) {
        fputs ("error: no students read from file.\n", stderr);
        return 1;
    }
    if (fp != stdin)    /* close file if not stdin */
        fclose (fp);

    for (size_t i = 0; i < nstudents; i++)  /* output each student data */
        printf ("%-8s %-24s %g\n",
                students[i].id, students[i].name, students[i].score);

    return 0;
}
All that remains is including the required headers, stdio.h
and string.h
and testing:
Example Use/Output
$ ./bin/read_stud_id_name_score dat/stud_id_name_no.txt
0001     William Bob              8.5
0034     Howard Stark             9.5
0069     Natalia Long Young       8
It works as needed.
Note, this is the most basic way of separating the values and only works based on the assumption that your score
field starts with a digit.
You can eliminate that assumption by parsing the information manually. Read each line in the same manner, but instead of using sscanf, declare a pair of pointers to isolate id, name and score yourself. The basic approach is to advance one pointer to the first whitespace and read id, then skip the following whitespace to position that pointer at the beginning of name. With the other pointer, start from the end of the line, back up to the last whitespace and read score, then continue backing up until the pointer sits at the first space after name. Then just copy the characters between your start and end pointers to name and nul-terminate. It is more involved from a pointer-arithmetic standpoint, but no harder conceptually. (That is left to you.)
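If you want something to compare your own attempt against, one possible shape of that manual parse is sketched below. The helper name parse_student_ptr() is hypothetical, and the sketch makes its own assumptions: fields are separated by whitespace, the score is the last whitespace-delimited field on the line, and strtod must consume that whole field.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAXID  8
#define MAXNM 64

typedef struct {        /* same layout as the struct above */
    char id[MAXID];
    char name[MAXNM];
    double score;
} student_t;

/* hypothetical helper: split line into id, name, score with two pointers,
 * returns 1 on success, 0 if the line does not hold all three fields
 */
int parse_student_ptr (const char *line, student_t *s)
{
    assert (line && s);

    const char *p = line,                       /* start pointer */
               *ep = line + strlen (line);      /* end pointer */
    student_t tmp = { .id = "" };               /* zeroed, so copies stay nul-terminated */
    char *endnum;
    size_t len;

    while (isspace ((unsigned char)*p))         /* skip leading whitespace */
        p++;
    const char *idstart = p;
    while (*p && !isspace ((unsigned char)*p))  /* advance to end of id */
        p++;
    len = (size_t)(p - idstart);
    if (len == 0 || len >= MAXID)               /* id missing or too long */
        return 0;
    memcpy (tmp.id, idstart, len);

    while (isspace ((unsigned char)*p))         /* p now at start of name */
        p++;

    while (ep > p && isspace ((unsigned char)ep[-1]))   /* trim trailing ws/'\n' */
        ep--;
    const char *scend = ep;                     /* one past end of score */
    while (ep > p && !isspace ((unsigned char)ep[-1]))  /* back up to start of score */
        ep--;
    if (ep == p)                                /* nothing left over for a name */
        return 0;

    tmp.score = strtod (ep, &endnum);           /* convert score field */
    if (endnum != scend)                        /* whole field must convert */
        return 0;

    while (ep > p && isspace ((unsigned char)ep[-1]))   /* back up over ws after name */
        ep--;
    len = (size_t)(ep - p);
    if (len == 0 || len >= MAXNM)               /* name missing or too long */
        return 0;
    memcpy (tmp.name, p, len);

    *s = tmp;
    return 1;
}
```

Because the score is located from the end of the line, a name containing digits (for example a line like "0007 Agent 47 Smith 9.1") no longer confuses the parse.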
Look things over and let me know if you have further questions. Normally, you would allocate your array of students dynamically and reallocate as needed to handle any number of students from the file. (From an actual C++ standpoint, you would instead use the vector and string types that the standard library provides and let the containers handle the memory allocation for you.) That too is just one additional layer you can add to make your code more flexible.
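As a rough illustration of that extra layer in C, the sketch below grows the array with realloc as lines are read. The function name readstudents_dyn(), the initial capacity of 8 and the doubling strategy are all assumptions made for the example, not part of the code above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXID   8
#define MAXNM  64
#define MAXLN 128

typedef struct {        /* same layout as the struct above */
    char id[MAXID];
    char name[MAXNM];
    double score;
} student_t;

/* hypothetical variant of readstudents(): allocates the array itself.
 * On return *n holds the number of students; the caller must free() the
 * returned pointer. Returns NULL if no students were stored.
 */
student_t *readstudents_dyn (FILE *fp, size_t *n)
{
    char buf[MAXLN];                /* buffer to hold each line */
    size_t used = 0, avail = 0;     /* elements used / allocated */
    student_t *s = NULL;

    assert (fp && n);

    while (fgets (buf, MAXLN, fp)) {
        student_t tmp = { .id = "" };

        if (sscanf (buf, "%7s %63[^0-9] %lf", tmp.id, tmp.name, &tmp.score) != 3)
            continue;               /* skip lines that do not match */

        size_t len = strlen (tmp.name);     /* trim trailing spaces */
        while (len && tmp.name[len - 1] == ' ')
            tmp.name[--len] = 0;

        if (used == avail) {        /* array full (or not yet allocated) */
            size_t newavail = avail ? avail * 2 : 8;
            /* realloc to a temporary pointer so s survives a failure */
            void *tmpptr = realloc (s, newavail * sizeof *s);
            if (!tmpptr) {
                perror ("realloc");
                break;              /* keep the students read so far */
            }
            s = tmpptr;
            avail = newavail;
        }
        s[used++] = tmp;            /* add student, increment count */
    }

    *n = used;
    return s;
}
```

In main() you would then replace the fixed array with a student_t pointer, pass the address of nstudents, and free() the pointer when you are done with the data.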
C++ Implementation
I apologize for glossing over a C++ solution, but given your use of C string functions in your posted code, I provided a C solution in return. A C++ solution making use of std::string and std::vector is not that much different, other than from a storage standpoint. The parsing of the three values is slightly different: the id is read first and the remainder of the line is read into name, then the score is obtained from the portion of the line held in name, and finally those characters are erased from name.
Changing the C FILE*
to std::ifstream
and the array of student_t
to a std::vector<student_t>
, your readstudents()
function could be written as:
void readstudents (std::ifstream& fp, std::vector<student_t>& s)
{
    student_t tmp;      /* temporary struct to fill */

    /* read id, then rest of line into name, until end of file */
    while (fp >> tmp.id && getline(fp, tmp.name)) {
        /* get offset to beginning digit within tmp.name */
        size_t  offset = tmp.name.find_first_of("0123456789"),
                nchr;   /* no. of chars converted with stod */
        if (offset == std::string::npos)    /* validate digit found */
            continue;
        /* convert to double, save in tmp.score */
        tmp.score = std::stod(tmp.name.substr(offset), &nchr);
        if (!nchr)      /* validate digits converted */
            continue;
        /* erase score, then backup using offset to erase spaces after name
         * (offset > 0 check prevents wrapping below zero on a missing name)
         */
        tmp.name.erase(offset);
        while (offset > 0 && tmp.name.at(offset - 1) == ' ')
            tmp.name.erase(--offset);
        s.push_back(tmp);   /* add temporary struct to vector */
    }
}
(note: the return type is changed to void
as the .size()
of the student vector can be validated on return).
The complete example would be:
#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
#include <vector>

struct student_t {      /* simple struct to hold student data */
    std::string id;
    std::string name;
    double score;
};

void readstudents (std::ifstream& fp, std::vector<student_t>& s)
{
    student_t tmp;      /* temporary struct to fill */

    /* read id, then rest of line into name, until end of file */
    while (fp >> tmp.id && getline(fp, tmp.name)) {
        /* get offset to beginning digit within tmp.name */
        size_t  offset = tmp.name.find_first_of("0123456789"),
                nchr;   /* no. of chars converted with stod */
        if (offset == std::string::npos)    /* validate digit found */
            continue;
        /* convert to double, save in tmp.score */
        tmp.score = std::stod(tmp.name.substr(offset), &nchr);
        if (!nchr)      /* validate digits converted */
            continue;
        /* erase score, then backup using offset to erase spaces after name
         * (offset > 0 check prevents wrapping below zero on a missing name)
         */
        tmp.name.erase(offset);
        while (offset > 0 && tmp.name.at(offset - 1) == ' ')
            tmp.name.erase(--offset);
        s.push_back(tmp);   /* add temporary struct to vector */
    }
}

int main (int argc, char **argv) {

    std::vector<student_t> students {};     /* vector of students */

    if (argc < 2) {     /* validate one argument given for filename */
        std::cerr << "error: filename required as 1st argument.\n";
        return 1;
    }

    std::ifstream fp (argv[1]);     /* use filename provided as 1st argument */
    if (!fp.good()) {   /* validate file open for reading */
        std::cerr << "file open failed\n";
        return 1;
    }

    /* read students from file, validate size, if zero, handle error */
    readstudents (fp, students);
    if (students.size() == 0) {
        std::cerr << "error: no students read from file.\n";
        return 1;
    }

    for (const auto& s : students)  /* output each student data */
        std::cout << std::left << std::setw(8) << s.id
                  << std::left << std::setw(24) << s.name
                  << s.score << '\n';
}
(the output is the same -- aside from 2-spaces omitted between the values)
Look things over and let me know if you have questions.