
I am reading from a text file which is formatted like so:

Firstname Surname Age NumberOfSiblings motherage dadage

Importing header files:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

A struct is defined as follows:

typedef struct {
    int person_ID; //not included in file
    char full_name[20];
    char sex[2];
    char countryOfOrigin[20];
    int num_siblings;
    float parentsAges[2]; //this should store mother and fathers age in an array of type float
} PersonalInfo;


void viewAllPersonalInformation(){
    FILE* file = fopen("People.txt", "r");
    if (file == NULL){
        printf("File does not exist");
        return;
    }
    int fileIsRead = 0;
    int idCounter = 0;

    PersonalInfo People[1000];
    //headers
    printf("%2s |%20s |%2s |%10s |%2s |%3s |%3s\n", "ID", "Name", "Sex", "Born In", "Number of siblings", "Mother's age", "Father's Age");

    do{
        fileIsRead = fscanf(file, "%s %s %s %d %f %f\n", People[idCounter].full_name, People[idCounter].sex, People[idCounter].countryOfOrigin, &People[idCounter].num_siblings, &People[idCounter].parentsAges[0], &People[idCounter].parentsAges[1]);

        People[idCounter].person_ID = idCounter;
        printf("%d %s %s %s %d %f %f\n", People[idCounter].person_ID, People[idCounter].full_name, People[idCounter].sex, People[idCounter].countryOfOrigin, People[idCounter].num_siblings, People[idCounter].parentsAges[0], People[idCounter].parentsAges[1]);
        idCounter++;
    }
    while(fileIsRead != EOF);
    fclose(file);


    printf("Finished reading file");
}


int main() {
    viewAllPersonalInformation();
    return 0;
}

Where People.txt looks like:

John O'Donnell F Ireland 3 32.5 36.1

Mary Mc Mahon M England 0 70 75

Peter Thompson F America 2 51 60

gsamaras
user2363025
  • Can you please share a minimal example? – gsamaras Dec 08 '19 at 17:56
  • The pointers you pass to `fscanf` are read from uninitialised memory. – Siguza Dec 08 '19 at 17:58
  • @gsamaras sorry, I don't fully understand what a minimal example is. I need to loop through the file, assign data to structs and print. I tried using fscanf to grab the values and printf to print them. Not sure what to leave out? – user2363025 Dec 08 '19 at 17:58
  • @user2363025 a complete minimal example, with your main, your struct, the way you allocate memory for your pointers before you read the data, and the way you read the data. By what you have in your question right now, one could **jump to the conclusion that you forgot to allocate memory for your pointers**. Is that the case? A complete minimal example would leave no doubt, and no need for hypotheses ;) – gsamaras Dec 08 '19 at 18:01
  • @gsamaras you did jump in the right direction :) I need to create the memory dynamically but just as a starting point I've amended the struct so that fullname and country are fixed length 20 and the float array is fixed length 2. But the program is still crashing on the fileread line – user2363025 Dec 08 '19 at 18:12
  • @user2363025 You expect to read two strings with the format specifier `%s`, in the case of full name. `%s` stops as soon as a whitespace is found, so it will only store the first name in `full_name`, and the surname will go to the 2nd `%s`, thus in `countryOfOrigin`. Please update your question with a **Minimal Complete Reproducible Example**, an example that I could just copy paste in my IDE and try it out. Now if I copy paste your code, it won't compile, since it's incomplete (there is no main method for instance). – gsamaras Dec 08 '19 at 19:34
  • @gsamaras I've added in the fixed lengths to the struct definition now. Again I will have to deal with dynamic memory ultimately, but I'd be happy to get it going with fixed lengths first and then refine later. Added the main definition which just calls this function for now. I added in another member variable to the struct. It comes after the name and before country of origin. It is always a single character of F or M, perhaps this is key in figuring out where name ends? – user2363025 Dec 08 '19 at 23:46
  • OT: regarding: `typedef struct {` most debuggers cannot access the individual fields without a 'tag' name for the struct. this greatly hinders the debugging operation – user3629249 Dec 09 '19 at 06:00
  • OT: regarding: `fileIsRead = fscanf(file, "%s %s %s %d %f %f\n", ... );` 1) always check the returned value to assure the operation was successful. `fscanf()` returns EOF or the number of successful 'input format conversion specifiers' I.E. any returned value other than `6` indicates an error occurred. 2) when using the specifiers `%s` and/or `%[...]` always include a MAX_CHARACTERS modifier that is 1 less than the length of the input buffer because those specifiers always append a NUL byte to the input. This also avoid any buffer overflow and the attendant undefined behavior – user3629249 Dec 09 '19 at 06:11
  • regarding the struct field: `char sex[2];` There is NOTHING in the example record layout (Firstname Surname Age NumberOfSiblings motherage dadage) that indicates if the person is male or female. so this field should either be removed or left blank – user3629249 Dec 09 '19 at 06:14
  • regarding: `fscanf(file, "%s %s %s %d %f %f\n", People[idCounter].full_name,...` There is no (simple) way to just read the persons full name Suggest: `char firstName[20]; char lastName[20]; fscanf(file, "%19s %19s ... \n", firstName, lastName, ... ); strcpy( Persons[id].fullname, firstName ); strcat( Person[ID].fullname, " " ); strcat( Person[ID].fullname, lastName );` – user3629249 Dec 09 '19 at 06:25
  • regarding: `printf("%2s |%20s |%2s |%10s |%2s |%3s |%3s\n", "ID", "Name", "Sex", "Born In", "Number of siblings", "Mother's age", "Father's Age");` the column header: `SEX` will take 3 characters but the format string only allows for 2 characters. the column header: `born in` is for a field that can be up to 19 characters long, so the `%10s` will not work, suggest: `%20s` the fullname field is only 20 chars wide, so a name like: alphrado batchickenson will not work. Other fields have similar problems – user3629249 Dec 09 '19 at 06:40
  • what happens if there are more than 99 records in the file? – user3629249 Dec 09 '19 at 06:44
  • regarding: `do { fscanf( ... ); printf( ... ) } while(fileIsRead != EOF);` if the call to `fscanf()` returns EOF, then what will be printed is garbage (probably a copy of the prior call to `fscanf()`. Suggest: `while( fscanf( ... ) == 7 ) { ... printf() }` Note: `7` after fixing the reading of the first and last names – user3629249 Dec 09 '19 at 06:48
  • regarding: `printf("Finished reading file");` This will stay in the output stream buffer until the program exits, then it will be displayed on the terminal. Suggest: `printf("Finished reading file\n");` the trailing '\n' will force the data to be output to the terminal immediately – user3629249 Dec 09 '19 at 06:53

2 Answers


The %s conversion specifier of fscanf() stops reading as soon as it meets whitespace. You expect a single %s to read the whole full name, but it only stores the first name in full_name. The surname goes to the second %s, which in your current code is sex (where it also overflows the 2-byte buffer), and every later field is shifted by one.

So, if you want to read "Peter Thompson", then you would need to introduce two strings (char arrays) to store the first name and the last name, and then concatenate them.

However, your full names vary in the number of words: "Peter Thompson" has 2, while "Mary Mc Mahon" has 3. If you stuck with fscanf(), how many %s would you use, 2 or 3? You can't know in advance, since it depends on the input, which you only get at run time. So I suggest you use fgets() instead (which also protects against buffer overflow, because you pass it the buffer size). Maybe there is some format-string trick to do this with fscanf() (it offers scansets, not real regexes), but I believe that reading each line with fgets() and then parsing it is better practice.


Now that we have read a line of the file with fgets(), what do we do with it? We still don't know how many words each full name consists of. How do we find out? By counting the spaces in the line: assuming tokens are separated by single spaces, a line with w spaces has w + 1 tokens (words, numbers or characters in your example).

With a simple if-else statement, we can tell apart the two scenarios in your example: 6 spaces (7 tokens) and 7 spaces (8 tokens, as in "Mary Mc Mahon M England 0 70 75").

Now, how do we extract the tokens (full name, sex and so on) from the line? We could loop over it with a bunch of if-else statements: until the 2nd (or 3rd, depending on the number of spaces) space is found, append the current token to full_name; the next token is then the sex, and so on.

Sure, you could do that, but since I am a bit lazy, I will build on your good work with fscanf() and use sscanf() instead to extract the tokens. With this approach, we need one or two (depending on the number of spaces) extra strings to temporarily store the surname parts before appending them to the name with strcat() or strncat().

Minimal Complete Working Example:

#include <stdio.h>
#include <string.h>

#define P 1000 // Max number of people
#define L 256  // Max length of line read from file (-1)

typedef struct {
    int person_ID; //not included in file
    char full_name[32];
    char sex[2];
    char countryOfOrigin[16];
    int num_siblings;
    float parentsAges[2];
} PersonalInfo;

int count_whitespaces(char* str)
{
    int whitespaces_count = 0;
    while(*str)
    {
        if(*str == ' ')
            whitespaces_count++;
        str++;
    }
    return whitespaces_count;
}

void viewAllPersonalInformation(){
    FILE* file = fopen("People.txt", "r");
    if (file == NULL){
        printf("File does not exist");
        return;
    }
    int fileIsRead = 0;
    int idCounter = 0;

    PersonalInfo People[P];
    // line of file, placeholder for biworded surnames, surname.
    char line[L], str[8], surname[16];
    //headers
    // You have 7 format specifiers for the headers, but only 6 in fscanf!!!
    printf("%2s |%5s |%2s |%10s |%2s |%3s |%3s\n", "ID", "Name", "Sex", "Born In", "Number of siblings", "Mother's age", "Father's Age");

    // read into 'line', from 'file', up to 255 characters (+1 for the NUL terminator)
    while(fgets(line, L, file) != NULL) {
        //fileIsRead = fscanf(file, "%s %s %s %s %d %f %f\n", People[idCounter].full_name, People[idCounter].full_name, People[idCounter].sex, People[idCounter].countryOfOrigin, &People[idCounter].num_siblings, &People[idCounter].parentsAges[0], &People[idCounter].parentsAges[1]);
        // eat trailing newline of fgets
        line[strcspn(line, "\n")] = 0;

        // Skip empty lines of file
        if(strlen(line) == 0)
            continue;

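        if(count_whitespaces(line) == 6)
        {
            // Each width is one less than its buffer size, leaving room for the NUL terminator;
            // %1s (not %c) so that 'sex' is NUL-terminated and safe to print with %s
            sscanf(line, "%31s %15s %1s %15s %d %f %f", People[idCounter].full_name, surname, People[idCounter].sex, People[idCounter].countryOfOrigin, &People[idCounter].num_siblings, &People[idCounter].parentsAges[0], &People[idCounter].parentsAges[1]);
        }
        else // 7 whitespaces, thus 8 tokens in the string
        {
            sscanf(line, "%31s %7s %15s %1s %15s %d %f %f", People[idCounter].full_name, str, surname, People[idCounter].sex, People[idCounter].countryOfOrigin, &People[idCounter].num_siblings, &People[idCounter].parentsAges[0], &People[idCounter].parentsAges[1]);
            // Separate name and first word of surname with a space (bounded, so full_name cannot overflow)
            strncat(People[idCounter].full_name, " ", sizeof People[idCounter].full_name - strlen(People[idCounter].full_name) - 1);
            strncat(People[idCounter].full_name, str, sizeof People[idCounter].full_name - strlen(People[idCounter].full_name) - 1);
        }

        // Separate name and surname with a space (bounded, so full_name cannot overflow)
        strncat(People[idCounter].full_name, " ", sizeof People[idCounter].full_name - strlen(People[idCounter].full_name) - 1);
        strncat(People[idCounter].full_name, surname, sizeof People[idCounter].full_name - strlen(People[idCounter].full_name) - 1);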

        People[idCounter].person_ID = idCounter;
        printf("%d %s %s %s %d %f %f\n", People[idCounter].person_ID, People[idCounter].full_name, People[idCounter].sex, People[idCounter].countryOfOrigin, People[idCounter].num_siblings, People[idCounter].parentsAges[0], People[idCounter].parentsAges[1]);
        idCounter++;
        if(idCounter == P)
        {
            printf("Max number of people read, stop reading any more data.\n");
            break;
        }
    }
    fclose(file);

    printf("Finished reading file.\n");
}


int main() {
    viewAllPersonalInformation();
    return 0;
}

Output:

ID | Name |Sex |   Born In |Number of siblings |Mother's age |Father's Age
0 John O'Donnell F Ireland 3 32.500000 36.099998
1 Mary Mc Mahon M England 0 70.000000 75.000000
2 Peter Thompson F America 2 51.000000 60.000000
Finished reading file.

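Did you notice the numbers in the format specifiers of sscanf()? They guard against buffer overflows: %s always appends a terminating NUL byte, so each width must be at most one less than the size of its destination buffer.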


What about Dynamic Memory Allocation?

In the code above, I estimated the maximum lengths of the name, country of origin and so on. What about making those sizes dynamic? We could, but we would still need an initial estimate.

We could read the name into a temporary array of fixed length and then find the actual length of the string with strlen(). With that information in hand, we can dynamically allocate exactly enough memory (the length plus one byte for the terminator, pointed to by a char pointer) and then copy the string from the temporary array to its final destination with strcpy().

gsamaras

If you have a pointer field char *full_name, it is only a pointer: it must be made to point at an existing object before use, which for a char * is usually an array of char. You can fix this in two ways:

  • Make the field an array, like char full_name[100], and pass the maximum string length to the scanf format string, like %99s (one less than the array size, leaving room for the terminating NUL). This is the simplest way.
  • Allocate the storage with malloc() and don't forget to free() it, or point the field at some other valid storage, e.g. declare an array as an ordinary automatic variable and assign the address of its first element to the pointer field. Remember that the address of an automatic variable becomes invalid once its function returns.

There is another problem. The %s conversion specifier tells fscanf to read a single word, stopping at any whitespace character such as a space. So, with your input format, full_name will only be read up to the first space, and the later attempt to read an integer will fail.

Northsoft