1

So, I am currently working on System programming for my Unix OS class. All that this program should do is read a binary file and output the lines to a CSV file. I feel like i'm almost done but for some reason I keep getting a segfault.

To clarify: fd1 = input file, fd2 = output file, numrecs = number of records from input file. Somewhere in main():

for(i=0;i<numrecs;i++){
    if((bin2csv(fd1, fd2)) == -1){
        printf("Error converting data.\n");
    }
}

int bin2csv(fd1, fd2){
    bin_record rec;
    char buffer[100];
    int buflen;
    strncpy(buffer,"\0", 100); /* fill buffer with NULL */
    recs = &rec;

    /* read in a record */
    if((buflen = read(fd1, &recs, sizeof(recs))) < 0){
        printf("Fatal Error: Data could not be read.\n");        
        return -1;
    }

   sprintf(buffer, "%d, %s, %s, %f, %d\n", recs->id, recs->lname, recs->fname, recs->gpa, recs->iq);
   printf("%s\n", buffer);
   write(fd2, buffer, sizeof(buffer));
   return 0;
}

The segfault is occurring on the line "sprintf(buffer, etc..);" however, I cannot figure out why that is happening.

This is the error gdb spits out:

Program received signal SIGSEGV, Segmentation fault.
0x0000000100000c87 in bin2csv (fd1=3, fd2=4) at bin2csv.c:25
25 sprintf(buffer, "%d, %s, %s, %f, %d\n", recs->id, recs->lname,
recs->fname, recs->gpa, recs->iq);

Hopefully this is enough info. Thanks!

Himanshu
  • 4,327
  • 16
  • 31
  • 39
  • 1
    Segmentation faults are caused by bad memory accesses, and since the only memory accesses on that line are `recs->...`, you can assume recs is not being properly initialized. Check if it's `NULL` right before, and if not, keep digging. – Ricky Mutschlechner Dec 16 '15 at 06:43
  • Post definition of `bin_record`. – chux - Reinstate Monica Dec 16 '15 at 15:50
  • `strncpy(buffer,"\0", 100); /* fill buffer with NULL */` That will not fill the buffer with **NUL** characters as `strncpy()` copies a *maximum* number of bytes, but still stops copying when the first NUL character is encountered in the source. – Andrew Henle Dec 16 '15 at 16:03
  • this line: `int bin2csv(fd1, fd2){` will not compile. Perhaps you meant: (if using `open()`) `int bin2csv( int fd1, int fd2 )` – user3629249 Dec 16 '15 at 19:41
  • regarding this line: `strncpy(buffer,"\0", 100); ` this will copy up to 0 bytes, then append a '\0' probably not what you want. Suggest: `memset( buffer, '\0', sizeof(buffer) );` – user3629249 Dec 16 '15 at 19:44
  • regarding this line: `recs = &rec;`, rec is defined as a struct , not as a pointer, and this line would ask it to point to itself, probably not what you want, suggest removing this line. – user3629249 Dec 16 '15 at 19:46
  • after this line: `if((buflen = read(fd1, &recs, sizeof(recs))) < 0){` and before using the contents of the `recs` to fill the buffer[] array, need to check for `buflen==0` (I.E. EOF) and for `buflen < sizeof(recs)` (not enough characters in the file to read a complete `recs`. – user3629249 Dec 16 '15 at 19:50
  • unless the fields: `recs->lname, recs->fname` are a fixed size, cannot be sure that whole record is read (and not part of the next record) If not a fixed length for those two fields, strongly suggest reading a single character at a time and parsing the input (or use `fscanf()`, but that will have its' own problems) – user3629249 Dec 16 '15 at 19:56
  • regarding this line: `write(fd2, buffer, sizeof(buffer));` there is no guarantee that the buffer[] array is exactly full (it could be less than full or even overfilled (resulting in undefined behaviour). What is really needed is the returned value from `sprintf()`, which the code fails to save, as the length (3rd) parameter. – user3629249 Dec 16 '15 at 20:01
  • regarding these two fields: `recs->lname, recs->fname` if those fields are not NUL terminated (and padded for short names) in the source file the sprintf() will crash. – user3629249 Dec 16 '15 at 20:07
  • 1
    @AndrewHenle That was my gut reaction too, because this is a non-standard way to zero a buffer, but `strncpy` will zero-pad the remainder of the buffer until `n` bytes have been written. – paddy Dec 16 '15 at 21:07
  • @user3629249 You have flooded the comments with points that are better suited to an answer. In addition, only a couple of the points you made are actually valid. – paddy Dec 16 '15 at 21:11
  • 1
    @paddy Yes indeed. `strncpy()` NUL pads the target string. Learn something new every day... :) – Andrew Henle Dec 16 '15 at 21:20
  • @paddy, The question does not contain enough detail to assure that any of my comments is the source of the OPs problem. (like some example input records, the format of the `recs` structure, etc) So it is possible that NONE of my comments are the answer. Under such cases, I'm very hesitant to post such comments as an answer. – user3629249 Dec 16 '15 at 22:57
  • when posting a question about a run time problem, post code that cleanly compiles, a sample of the input, the desired output, etc. – user3629249 Dec 16 '15 at 23:01
  • This was my first post so it's my bad if things were not explained very well. I know it's hard to diagnose problems while reading other peoples code, especially just a small portion of it. Thanks for all the feedback though! – Jake Parham Dec 17 '15 at 04:50

4 Answers4

3

It looks like recs is a pointer. You are reading bytes directly into that pointer, like reading a raw memory address from file:

read(fd1, &recs, sizeof(recs))

And then you start using it in the call to sprintf... BOOM!

There is actually no reason to use it at all (is it a global?)... Even though you initialised it by recs = &rec, and assuming you don't trash it, it still will not contain valid address outside of that function. That's because rec is a local variable.

So, just read directly into rec like this:

read(fd1, &rec, sizeof(rec))

And then on the sprintf line, you use rec.id instead of recs->id (etc).

paddy
  • 60,864
  • 6
  • 61
  • 103
  • This fixed the segmentation fault! Thanks for that information. – Jake Parham Dec 16 '15 at 20:35
  • if `recs` is a pointer, then this line: `if((buflen = read(fd1, &recs, sizeof(recs))) < 0){` will fail because the second parameter of `read()` has to be a pointer to the input buffer, not a pointer to a pointer to the input buffer. in the OPs posted code, the variable `recs` is undefined – user3629249 Dec 16 '15 at 22:59
  • Thanks for paraphrasing my answer. – paddy Dec 17 '15 at 00:20
2

I see a few issues here:

  1. sprintf does nothing to prevent writing past the end of the string buffer. In fact it has no knowledge of the length of that buffer (100 bytes in your case). Since you have setup the buffer in the stack, if sprintf over-runs your buffer (which it could do with long first or last names or garbage strings as input) your stack will be corrupted and a seg fault is likely. You may want to consider including logic to ensure that sprintf will not exceed the amount of buffer space you have. Or better yet avoid sprintf altogether (more on that below)

  2. You are not handling end-of-file in the code provided. For end of file, read returns 0. If you pass bad pointers to sprintf, it will fail.

  3. The functions that you are using are the UNIX derived ones (part of POSIX but decidedly low level) that use small integers as file descriptors. I would recommend using the FILE * based ones instead. The I/O functions of interest would be fopen, fclose, fprintf, fwrite, etc. This would eliminate the need to use sprintf.

See this previous question for more information.

Community
  • 1
  • 1
Ken Clement
  • 748
  • 4
  • 13
  • Part of the description of this particular homework was that we were not able to use FILE * or any f*() functions. Otherwise, I would have used those. – Jake Parham Dec 16 '15 at 20:18
  • @Jake Parham, fair enough. I endorse Rahim's suggestion below about using `snprintf` then to protect against buffer overruns. If you do not have to produce re-entrant code in this assignment (likely), I also suggest taking the buffer out of the stack by marking it static. The identifier will still be local to the function. Good luck on your assignment. – Ken Clement Dec 16 '15 at 21:07
1
if((buflen = read(fd1, &recs, sizeof(recs))) < 0){

Use <= 0 rather than < 0, else when the return value is 0, sprintf(buffer ... may seg fault as it tries to de-reference recs->id which has an uninitialized value.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
0

You have some problems: 1) structure of bin_record. It has char[] and it is possible to overflow. 2) in sprintf you cannot set buffer max size. it is better to use snprintf like this:

 sprintf(buffer, 100, "%d, %s, %s, %f, %d\n", recs->id, recs->lname, recs->fname, recs->gpa, recs->iq);

3) to fill buffer with null us this:

memset (buffer,'\0',100);
Rahim Dastar
  • 1,259
  • 1
  • 9
  • 15