2

In the following code I have the array size set to 20. In Valgrind the code tests clean. But as soon as I change the size to 30, it gives me errors (showed further below). The part that confuses me is that I can change the value to 40 and the errors go away. Change it to 50, errors again. Then 60 tests clean and so on. Keeps going like that. So I was hoping someone might be able to explain this to me. Because it's not quite coming clear to me despite my best efforts to wrap my head around it. These errors were hard to pinpoint because the code by all appearances was valid.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct record {
    int number;
    char text[30];
};

int main(int argc, char *argv[])
{
    FILE *file = fopen("testfile.bin", "w+");
    if (ferror(file)) {
        printf("%d: Failed to open file.", ferror(file));
    }

    struct record rec = { 69, "Some testing" };

    fwrite(&rec, sizeof(struct record), 1, file);
    if (ferror(file)) {
        fprintf(stdout,"Error writing file.");
    }

    fflush(file);
    fclose(file);
}

Valgrind errors:

valgrind --leak-check=full --show-leak-kinds=all\
                --track-origins=yes ./fileio
==6675== Memcheck, a memory error detector
==6675== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==6675== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==6675== Command: ./fileio
==6675== 
==6675== Syscall param write(buf) points to uninitialised byte(s)
==6675==    at 0x496A818: write (in /usr/lib/libc-2.28.so)
==6675==    by 0x48FA85C: _IO_file_write@@GLIBC_2.2.5 (in /usr/lib/libc-2.28.so)
==6675==    by 0x48F9BBE: new_do_write (in /usr/lib/libc-2.28.so)
==6675==    by 0x48FB9D8: _IO_do_write@@GLIBC_2.2.5 (in /usr/lib/libc-2.28.so)
==6675==    by 0x48F9A67: _IO_file_sync@@GLIBC_2.2.5 (in /usr/lib/libc-2.28.so)
==6675==    by 0x48EEDB0: fflush (in /usr/lib/libc-2.28.so)
==6675==    by 0x109288: main (fileio.c:24)
==6675==  Address 0x4a452d2 is 34 bytes inside a block of size 4,096 alloc'd
==6675==    at 0x483777F: malloc (vg_replace_malloc.c:299)
==6675==    by 0x48EE790: _IO_file_doallocate (in /usr/lib/libc-2.28.so)
==6675==    by 0x48FCBBF: _IO_doallocbuf (in /usr/lib/libc-2.28.so)
==6675==    by 0x48FBE47: _IO_file_overflow@@GLIBC_2.2.5 (in /usr/lib/libc-2.28.so)
==6675==    by 0x48FAF36: _IO_file_xsputn@@GLIBC_2.2.5 (in /usr/lib/libc-2.28.so)
==6675==    by 0x48EFBFB: fwrite (in /usr/lib/libc-2.28.so)
==6675==    by 0x10924C: main (fileio.c:19)
==6675==  Uninitialised value was created by a stack allocation
==6675==    at 0x109199: main (fileio.c:11)
==6675== 
==6675== 
==6675== HEAP SUMMARY:
==6675==     in use at exit: 0 bytes in 0 blocks
==6675==   total heap usage: 2 allocs, 2 frees, 4,648 bytes allocated
==6675== 
==6675== All heap blocks were freed -- no leaks are possible
==6675== 
==6675== For counts of detected and suppressed errors, rerun with: -v
==6675== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
lee8oi
  • 1,169
  • 8
  • 12

3 Answers3

4

The problem is that there is padding in the structure to make the int a always aligned by 4 in memory, even within an array of struct records. Now, 20+4 is divisible by 4, and so is 40+4 and 60+4. But 30+4 and 50+4 are not. Hence 2 padding bytes need to be added to make the sizeof (struct record) divisible by 4.

When you're running the code with array size 34, sizeof (struct record) == 36, and bytes 35 and 36 contain indeterminate values - even if the struct record is otherwise fully initialized. What is worse, code that writes indeterminate values can leak sensitive information - the Heartbleed bug being a prime example.

The solution is actually to not write the structure using fwrite. Instead write the members individually - this improves portability too. There isn't much performance difference either, as fwrite buffers the writes and so does fread.


P.S. the road to hell is paved with packed structs, you want to avoid them like plague in generic code.


P.P.S. ferror(file) will almost certainly never be true just after fopen - and in normal failures fopen will return NULL and ferror(NULL) will probably lead to a crash.

3

[edit]

My answer relates to a weakness in OP's code, yet the Valgrind write(buf) points to uninitialized byte(s) is due to other reasons answered by others.


When the open fails, ferror(file) is undefined behavior (UB).

if (ferror(file)) is not the right test for determining open success.

FILE *file = fopen("testfile.bin", "w+");
// if (ferror(file)) {
//    printf("%d: Failed to open file.", ferror(file));
// }
if (file == NULL) {
    printf("Failed to open file.");
    return -1;  // exit code, do not continue
}

I do not see other obvious errors.


ferror(file) is useful to test the result of I/O, not of opening a file.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • @Dacav Perhaps. Yet `if (ferror(NULL))` is _undefined behavior_. Without a good check of `file`, the rest of code is questionable. – chux - Reinstate Monica Mar 31 '19 at 22:37
  • Thanks, I'll remember this in the future and be sure to read more closely about the return values/etc. – lee8oi Mar 31 '19 at 23:04
  • Turns out OP is a giant troll, heh @chux :D – Dacav Mar 31 '19 at 23:10
  • I've changed my code to include this fix too. I realized already it was good suggestion, just noted that it wasn't the answer or solution to the problem I was trying to noobishly wrap my head around. – lee8oi Mar 31 '19 at 23:52
0

I initially misinterpreted the valgrind output, so @chux's deserves acceptance. I'll try to put together the best answer I can though.

Checking errors

The first error (the one I didn't immediately consider) is to check the value returned by fopen(3) with ferror(3). The fopen(3) call returns NULL on error (and sets errno), so checking NULL with ferror(3) is wrong.

Serializing a structure on a file.

With the initialization you write all the fields of your structure, but you don't initialize all the memory it covers. Your compiler might for example leave some padding in the structure, in order to get better performance while accessing data. As you write the whole structure on the file, you are actually passing non-initialized data to the fwrite(3) function.

By changing the size of the array you change Valgrind's behaviour. Probably this is due to the fact that the compiler changes the layout of the structure in memory, and it uses a different padding.

Try wiping the rec variable with memset(&rec, 0, sizeof(rec)); and Valgrind should stop complaining. This will only fix the symptom though: since you are serializing binary data, you should mark struct record with __attribute__((packed)).

Initializing memory

Your original initialization is good.

An alternative way of initializing data is to use strncpy(3). Strncpy will accept as parameters a pointer to the destination to write, a pointer to the source memory chunk (where data should be taken from) and the available write size.

By using strncpy(&rec.text, "hello world", sizeof(rec.text) you write "hello world" over the rec.text buffer. But you should pay attention to the termination of the string: strncpy won't write beyond the given size, and if the source string is longer than that, there won't be any string terminator.

Strncpy can be used safely as follows

strncpy(&rec.text, "hello world", sizeof(rec.text) - 1);
rec.text[sizeof(rec.text) - 1] = '\0';

The first line copies "hello world" to the target string. sizeof(rec.text) - 1 is passed as size, so that we leave room for the \0 terminator, which is written explicitly as last character to cover the case in which sizeof(rec.text) is shorter than "hello world".

Nitpicks

Finally, error notifications should go to stderr, while stdout is for results.

Dacav
  • 13,590
  • 11
  • 60
  • 87
  • But notice that I can change 30 to 40, suddenly it's not sending initialized bytes anymore. Size 50 does, 60 doesn't. – lee8oi Mar 31 '19 at 22:14
  • 1
    `struct record rec = { 69, "Some testing" };` looks like initialized to me. – wildplasser Mar 31 '19 at 22:15
  • Don't focus on that. Undefined behaviours happens when memory is not initialized. – Dacav Mar 31 '19 at 22:15
  • "With the initialization you don't write all the bytes of the text field." and "you are are actually handling non-initialized data." are incorrect. Initialization is either all or nothing, not partial in C. `struct record rec = { 69, "Some testing" };` initializes all of `rec`. OP's mistake is every other build, `file` is `NULL` – chux - Reinstate Monica Mar 31 '19 at 22:15
  • @chux, yes, I should rephrase that. – Dacav Mar 31 '19 at 22:18
  • I'm a student, haven't used __attribute__((packed)) or such yet. I'll look into it. This was something I was doing to further practice reading & writing files and the memory errors in Valgrind threw me off and were hard to narrow down. I'm trying to apply your memset suggestion but it's making it so I get an assignment to expression error while I'm trying to strdup a string to rec.text. Can't do a regular initialization since I had to split that up to insert the memtest. – lee8oi Mar 31 '19 at 22:27
  • Checking (file == NULL) has no bearing on the valgrind errors. – lee8oi Mar 31 '19 at 22:31
  • @lee8oi - In order to properly assign the string, use `strncpy(&rec.text, "Some testing", sizeof(rec.text))`. This will allow you to split declaration and assignment. Take care: `text` is an array of bytes, and there's some syntactic ambiguity between arrays and pointers which usually confuse novices. As you progress in learning C, you will understand :) – Dacav Mar 31 '19 at 22:34
  • And `strncpy()`is always a bad advice.(well: *almost* always) – wildplasser Mar 31 '19 at 22:39
  • @wildplasser compare `strncpy` vs `strcpy`. `strncpy` is good if you know what you are doing. Of course the null terminator can be left off, if there's no space. – Dacav Mar 31 '19 at 22:40
  • @Dacav still can't assign an expression to an array type. Can you show me an example of how that section should look with that strncpy command? fileio.c:20:14: error: assignment to expression with array type rec.text = strncpy(&rec.text, "Some testing", sizeof(rec.text)); – lee8oi Mar 31 '19 at 22:41
  • @lee8oi -- `struct record rec; strncpy(&rec.text, "some testing", sizeof(rec.text) - 1); `rec.text[sizeof(rec.text) - 1] = '\0';` But if you fix your error checking *and* declare a packed struct, your code is good. You can avoid `strncpy` – Dacav Mar 31 '19 at 22:43
  • Ah, I thought I had to return it to the rec.text as well. Duh. I guess I need to keep reading. – lee8oi Mar 31 '19 at 22:45
  • strncpy allows me to use the memset suggestion, but by itself doesn't take care of the errors. – lee8oi Mar 31 '19 at 22:49
  • This answer seems like it changed 100 times lol. But it's useful in its final form that I see here. I had a feeling there was some extra bytes being used that didn't add up to a full character (or something), when the array size wasn't 'even'. But I didn't quite connect the dots very well. Thank you for sticking through with me and working so hard on your answer. – lee8oi Mar 31 '19 at 23:49
  • @lee8oi - sure, no problem. I got somewhat confused on what caused the error, and I just made it up by fixing the answer and clearing the confusion I made. It's just a matter of responsibility :) – Dacav Apr 01 '19 at 06:49