I am writing a function in C that is supposed to read information about students and their grades from a file. However, when the programme is run, the tests sometimes show a profiler error at fscanf, even though the output is as expected.

/*structure for grade*/
struct Grade {
  char subject[51];
  int grade;
};
/*structure for student*/
struct Student {
  char name[21], surname[21];
  int no_grades;
  struct Grade grades[100];
};
int input_students(struct Student *students, int n) {
  int i = 0, no, j, first;
  char name[21], surname[21], subject[51];
  FILE *database = fopen("input.txt", "r");
  if (database == NULL) {
    printf("Error while opening input.txt");
    return 0;
  }
  while (/*i < n && */fscanf(database, "%20s %20s %50s %d\n", name, surname, subject,
                         &no) == 4) {
    first = 1;
    for (j = 0; j < i; j++) {
      if (strcmp(students[j].name, name) == 0 &&
          strcmp(students[j].surname, surname) == 0) {
        first = 0;
        if (students[j].no_grades >= 100) { /*limit for number of grades per student is 100*/
          break;
        }
        strcpy(students[j].grades[students[j].no_grades].subject,
               subject);
        students[j].grades[students[j].no_grades].grade = no;
        students[j].no_grades++;
        break;
      }
    }
    if (first && i < n) {
      strcpy(students[i].name, name);
      strcpy(students[i].surname, surname);
      students[i].no_grades = 0;
      strcpy(students[i].grades[students[i].no_grades].subject, subject);
      students[i].grades[students[i].no_grades].grade = no;
      students[i].no_grades++;
      i++;
    }
  }
  /*while (fscanf(database, "%20s %20s %50s %d\n", name, surname, subject,
                         &no) == 4) {
    for (j = 0; j < i; j++) {
      if (strcmp(students[j].name, name) == 0 &&
          strcmp(students[j].surname, surname) == 0) {
        if (students[j].no_grades >= 100) {
          break;
        }
        strcpy(students[j].grades[students[j].no_grades].subject,
               subject);
        students[j].grades[students[j].no_grades].grade = no;
        students[j].no_grades++;
        break;
      }
    }
  }*/

  fclose(database);
  return i;
}

(Update: the commented-out parts change the code to use only one loop; the same thing happens.) The return value is the number of students whose information was read from the file, which is at most n.

The profiler message indicates a mistake in the line with fscanf.

==3928== Invalid write of size 1
==3928== at 0x37154582C7: _IO_vfscanf (in /lib64/libc-2.12.so)
==3928== by 0x371546465A: __isoc99_fscanf (in /lib64/libc-2.12.so)
==3928== by 0x400AD5: input_students (main.c:22)
==3928== by 0x401394: main (main.c:174)
==3928== Address 0x7feff2e00 expected vs actual:
==3928== Expected: stack array "name" of size 21 in frame 2 back from here
==3928== Actual: stack array "surname" of size 21 in frame 2 back from here
==3928== Actual: is 32 before Expected

I can't seem to find the cause of this. I have double-checked that each string ends with '\0', I have run the programme through the debugger and it stops where it should, I have not seen it go out of bounds or work with uninitialised values, and the values it reads are as expected.

I have translated the names of the function and variables, as they weren't originally in English; therefore the problem cannot be the name of the function clashing with the name of a library function, but I will change it here.

Test code. It's an autotest.

int i, j, no_students;
struct Student students[10];
FILE* database = fopen("input.txt", "w");
fputs("Pero Peric Osnove_racunarstva 8", database);
fputc(10, database);
fputs("Suljo Suljic Osnove_racunarstva 9", database);
fputc(10, database);
fputs("Pero Peric Inzenjerska_matematika_1 6", database);
fclose(database);

no_students=input_students(students, 10);
printf("%d\n", no_students);
for (i=0; i<no_students; i++) {
    printf("%s %s ", students[i].name, students[i].surname);
    for (j=0; j<students[i].no_grades; j++)
        printf("%s %d ", students[i].grades[j].subject, students[i].grades[j].grade);
    printf("\n");
}

Output for this test.

2
Pero Peric Osnove_racunarstva 8 Inzenjerska_matematika_1 6 
Suljo Suljic Osnove_racunarstva 9

When the test code is run within main.c, it does not show any compiler errors, nor does the debugger show a segmentation fault.

Example for input.txt, created by this autotest.

Pero Peric Osnove_racunarstva 8
Suljo Suljic Osnove_racunarstva 9
Pero Peric Inzenjerska_matematika_1 6

This is the information about the profiler; bear in mind I am still only a student.

==17000== exp-sgcheck, a stack and global array overrun detector
==17000== NOTE: This is an Experimental-Class Valgrind Tool
==17000== Copyright (C) 2003-2012, and GNU GPL'd, by OpenWorks Ltd et al.
==17000== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==17000== Command: outputP9YqwH
    Please [edit] your question and create a [mre]. Maybe the compiler does not have a correct prototype for your function `read` when it compiles `main`? This is the name of a standard library function, so it is at least confusing, but may as well lead to errors. Do you get any compiler warnings? If yes, show the warnings in your question. If not, how do you compile your program? Please answer in your question, not in comments. – Bodo Feb 13 '21 at 19:10
  • 1
    What is `predmet`? `strcpy(students[j].grades[students[j].no_grades].subject, predmet);` looks unsafe. – chux - Reinstate Monica Feb 13 '21 at 19:28
  • 1
    It s the subject. I m changing it right now. – H.C. Feb 13 '21 at 19:29
  • 2
    Why are there 2 `while (i < n && fscanf(database, "%20s %20s %50s %d\n", name, surname, subject, &no) == 4) {` loops in `input_students()`? – chux - Reinstate Monica Feb 13 '21 at 19:35
  • As the file is formated as a list, the same student might not be listed consecutively, and the first loop stops when it reaches the capacity, so in the file there might be left some information about students already input, that are expected. Or for example if we have 50 different students, and lets say 51 lines in the file, and the last two are regarding the same student. As long as the number of grades is <100 it is expected that all information about one student will be input, and that would not be true without the second loop. There might be a better solution for that tho. – H.C. Feb 13 '21 at 19:38
  • 1
    The content of input.txt for this example has been added. – H.C. Feb 13 '21 at 19:42
  • to be fair to others I will use English and good job in translating, have seen 3 questions with native language today on SO :S. `if (students[j].no_grades == 100)` use `>=100` think what will happen if your file has this `Pero Peric Inzenjerska_matematika_1 106` –  Feb 13 '21 at 19:47
  • That is a very good point, even though it doesn't solve my initial problem. I will be more careful about that in the future. – H.C. Feb 13 '21 at 19:50
  • What happens if you turn off optimizations? – possum Feb 13 '21 at 19:55
  • The first fscanf, and the condition == 4, is supposed to prevent reading beyond the end, and take care if there is a point in file which or beyond which the file is badly formated. – H.C. Feb 13 '21 at 19:59
  • Which optimisation? – H.C. Feb 13 '21 at 20:00
  • `gcc -O0 main.c`, in `gcc` to disable optimization you put `-O0`, `-O1` is slight optimization etc, up to `-O3` –  Feb 13 '21 at 20:00
  • If that is something that has to do with the tests, that is beyond my control. – H.C. Feb 13 '21 at 20:02
  • And if there are no more parameters, or if they are not as expected, by the compiler that is, it will not carry that value, if I'm not mistaken? Therefore that part should be just fine, yet it shows going into uninitialised value. Which would be alright, if not for the fact that sometimes when the autotests are run, they show this profiler error, and each shows the same, while other times they work perfectily fine and there are no warnings. – H.C. Feb 13 '21 at 20:05
  • `fscanf` is probably *safe* to use in this way, but that doesn't mean it's necessarily *correct*; _ie_, when getting a string that's over-length, it will stop reading before buffer overflow, then on the next token, start where it left off. – Neil Feb 13 '21 at 23:51
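
A minimal sketch of the behaviour described in the last comment, using sscanf on a string instead of a file. The helper name split_two and the 5-character width are assumptions chosen only to keep the example short; they are not part of the question's code.

```c
#include <stdio.h>

/* Hypothetical helper: splits `text` with two width-limited %5s conversions.
   Each %5s stores at most 5 characters plus the terminating '\0', so the
   6-byte buffers cannot overflow. Returns the number of fields stored. */
int split_two(const char *text, char first[6], char second[6]) {
    return sscanf(text, "%5s %5s", first, second);
}
```

Calling split_two("abcdefgh ij", a, b) returns 2 and stores "abcde" and "fgh": nothing overflows, but the second field silently starts in the middle of the first token.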

2 Answers

2

Ok, here is what I think is happening:

no_students=input_students(students, 10);

so you are expecting up to 10 students, yet the file has only 3 rows. For simplicity, let's say the file pointer database points to row 0 (instead of address 0). So you have this:

row  name   surname subject                  grade
0    Pero   Peric   Osnove_racunarstva       8\n
1    Suljo  Suljic  Osnove_racunarstva       9\n
2    Pero   Peric   Inzenjerska_matematika_1 6<END OF FILE>

NOTE: the end of row 2 might also be the end of the file.

So your first loop reads rows 0, 1 and 2 and then tries to read row 3. There is no row 3, so it fails. But what might be happening is that database is pointing to row 3 when the first loop is done, skipping EOF and reading beyond the end of the file buffer.

There are two ways of testing for this:

  1. add an empty line after the last line of the input file
  2. close and reopen file between while() loops.

The other reason that you might be skipping beyond file pointer is this:

fscanf(database, "%20s %20s %50s %d\n", name, surname, subject, &no) == 4)

Notice the \n after %d. What happens if that \n is missing from the input? fscanf() is looking for '\n'.

Try adding empty line to the end of input file.

Or removing '\n'.

fscanf(database, "%20s %20s %50s %d", name, surname, subject, &no) == 4)

Another possible reason is the field width in %20s, due to the fact that fscanf reads at most 20 characters for name:

fscanf(database, "%s %s %s %d", name, surname, subject, &no) == 4)

Please let me know in the comments if I am mistaken.

EDIT: Now I am sure that I am right: Read this. scanf and fscanf should be used with caution and are ideally avoided.
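
One common way to sidestep fscanf is to read a whole line with fgets and then parse it with sscanf. The following is only a sketch under assumptions: the helper name read_record, the 256-byte line buffer, and the return convention are made up for illustration and are not part of the question's code.

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from `in` and parses it as
   "name surname subject grade".
   Returns 1 on success, 0 if the line is malformed, EOF at end of input.
   Assumes no input line is longer than 255 characters. */
int read_record(FILE *in, char name[21], char surname[21],
                char subject[51], int *grade) {
    char line[256];
    if (fgets(line, sizeof line, in) == NULL)
        return EOF;
    /* Field widths still protect the destination buffers. */
    return sscanf(line, "%20s %20s %50s %d", name, surname, subject, grade) == 4;
}
```

With this split, a malformed line is consumed as a whole instead of leaving a half-read token in the stream, so the caller can decide to skip it and keep reading.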

  • But would such a thing have been possible, when there was only one while()? I am asking because the same thing happened then, and now I see a way to do it with only one loop. – H.C. Feb 13 '21 at 20:30
  • 1
    Same thing happens when there's only one loop, I will put up the code. – H.C. Feb 13 '21 at 20:31
  • Input file is not something I have access to, beyond seening what is in it, it is generated by and for the autotests. I have no control over the content in the input.txt. – H.C. Feb 13 '21 at 20:39
  • @H.C try removing '\n' than. –  Feb 13 '21 at 20:41
  • Tried it now, tried it before, doesn't make any difference, it's highly unpredictable whether it shows the error or not. I don't know what to do about this. – H.C. Feb 13 '21 at 20:45
  • on the same line? –  Feb 13 '21 at 20:54
  • No changes what so ever in the results from the profiler – H.C. Feb 13 '21 at 20:56
  • @H.C can you post original code? Code above does not compile you have left `predmet` in one case and `studenti` in another. when I fix those mistakes I cannot reproduce the behavior –  Feb 13 '21 at 21:19
  • @H.C you can add a line to the input file by doing `fputc(10, database);` after `fputs("Pero Peric Inzenjerska_matematika_1 6", database);`. You know what `fputc()` does, number 10 is putting ASCII 10 into the file. ASCII character 10 == `\n` –  Feb 13 '21 at 21:37
  • 1
    Nitpick: `EOF` character is (probably) a misnomer. See [why is !eof always wrong](https://stackoverflow.com/a/26557243/2472827), "EOF is the response you get from an attempted I/O operation." – Neil Feb 14 '21 at 00:05
  • I think a more clear way is, the file ends before 4 could be read. On all `scanf` functions, whitespace in the format string is skip a variable number of whitespace characters. It is hard to say whether this makes a difference. Removing the subscript puts you in danger of running a buffer overflow, so I wouldn't touch that. – Neil Feb 14 '21 at 01:32
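
To illustrate the point in the last comment about whitespace in the format string, here is a small sketch that applies the question's format to a string with sscanf. The helper name parse_record is an assumption made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: parses one record with the question's format string.
   The trailing "\n" in the format matches any amount of whitespace,
   including none, so it does not require a newline in the input.
   Returns the number of fields sscanf converted. */
int parse_record(const char *line, char name[21], char surname[21],
                 char subject[51], int *grade) {
    return sscanf(line, "%20s %20s %50s %d\n", name, surname, subject, grade);
}
```

Under this reading, parse_record returns 4 for "Pero Peric Inzenjerska_matematika_1 6" whether the string ends right after the 6 or is followed by one or more newlines.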
1

The solution was rather simple: instead of using four separate variables (name, surname, subject and no), it is much simpler to declare a new variable

struct Student s;

and use it for input.

Not only does it solve the problem with fscanf(), but the rest of the code is simpler and easier to read.

As for the reason why it works, that would need further research.

Here is the revisited code

int input_students(struct Student *students, int n) {
  int i = 0, j, first;
  struct Student s; /* scratch record that fscanf reads into */
  FILE *database = fopen("input.txt", "r");
  if (database == NULL) {
    printf("Error while opening input.txt");
    return 0;
  }
  while (fscanf(database, "%20s %20s %50s %d\n", s.name, s.surname, s.grades[0].subject,
                         &s.grades[0].grade) == 4) {
    first = 1;
    for (j = 0; j < i; j++) {
      if (strcmp(students[j].name, s.name) == 0 &&
          strcmp(students[j].surname, s.surname) == 0) {
        first = 0;
        if (students[j].no_grades >= 100) { /*limit for number of grades per student is 100*/
          break;
        }
        students[j].grades[students[j].no_grades] = s.grades[0];
        students[j].no_grades++;
        break;
      }
    }
    if (first && i < n) {
      s.no_grades = 1; /* a new student starts with the grade just read */
      students[i] = s;
      i++;
    }
  }

  fclose(database);
  return i;
}