I have a file and using C I want to read the contents of it using fread() (from stdio.h) and write it into the members of a struct. (In my case there is a 2 byte int at the start followed by a 4 byte int.) But after writing the contents of the file correctly into the first two byte variable of the struct, it skips two bytes before continuing with the second four byte variable.

To demonstrate, I have created a 16 byte file to read from. In Hex it looks like this (Little-endian): 22 11 66 55 44 33 11 11 00 00 00 00 00 00 00 00

With the following code I expect the first variable, twobytes, to be 0x1122 and the second, fourbytes, to be 0x33445566. But instead it prints:

twobytes: 0x1122 
fourbytes: 0x11113344

sizeof(FOO) = 8
&foo     : 0061FF14
&foo.two : 0061FF14
&foo.four: 0061FF18

Skipping bytes 3 and 4 (0x66 & 0x55). Code:

#include <stdio.h>
#include <stdint.h>

int main(void) {

    FILE* file = fopen("216543110.txt", "r");
    if (file==NULL) { return 1; }

    typedef struct
    {
        uint16_t twobytes;
        uint32_t fourbytes;
    }__attribute__((__packed__)) // removing this attribute or just the underscores around packed does not change the outcome
    FOO;
    
    FOO foo;
    
    fread(&foo, sizeof(FOO), 1, file);
    
    printf("twobytes: 0x%x \n", foo.twobytes);
    printf("fourbytes: 0x%x \n\n", foo.fourbytes);

    printf("sizeof(FOO) = %d\n", sizeof(FOO));
    printf("&foo     : %p\n", &foo);
    printf("&foo.two : %p\n", &foo.twobytes);
    printf("&foo.four: %p\n", &foo.fourbytes);
    
    fclose(file);
    return 0;
}

Using a struct with two same size integers works as expected.


So: Using fread() to write into different size variables causes skipping bytes:

22 11 .. .. 44 33 11 11 ...

instead of

22 11 66 55 44 33 ...


I am aware that something about byte alignment is playing a role here, but how does that affect the reading of bytes? If C wants to add padding to the structs, how does that affect the reading from a file? I don't care if C is storing the struct members as 22 11 .. .. 66 55 44 33 ... or 22 11 66 55 44 33 ..., I'm confused about why it fails to read my file correctly.

Also, I am using gcc version 6.3.0 (MinGW.org GCC-6.3.0-1)

  • if sizeof struct is 8 bytes then it reads 8 bytes. – stark Jul 23 '20 at 11:48
  • You should check/print the value of `sizeof(FOO)`, which appears to be 8 bytes in your code, with the two 'missing' bytes added as padding between the two structure members. I think you need to apply the `__attribute__((__packed__))` to **individual members** of the structure. – Adrian Mole Jul 23 '20 at 11:54
  • @Adrian Mole, According to [this](https://gcc.gnu.org/onlinedocs/gcc-4.0.2/gcc/Type-Attributes.html), it can be used on the struct too. But it shows it before the curlies. Of course, that could be for a different compiler or compiler version than the OP is using – ikegami Jul 23 '20 at 12:05
  • @Paul Tashkent, Since you're asking about a non-standard compiler extension, you should mention the compiler and its version – ikegami Jul 23 '20 at 12:08
  • Please add the following lines to your program and tell us the output: 1. `printf( "%d\n", sizeof(FOO) );` 2. `printf( "%p\n", &foo );` 3. `printf( "%p\n", &foo.twobytes );` 4. `printf( "%p\n", &foo.fourbytes );` – Andreas Wenzel Jul 23 '20 at 12:15
  • Try `__attribute__((packed))` instead of `__attribute__((__packed__))` and I'm surprised you don't get a warning about an unknown attribute. – user253751 Jul 23 '20 at 12:17
  • When I [run the code on OnlineGDB](https://onlinegdb.com/rycSfWPxD), I get `sizeof(FOO) == 6` with `__attribute__((__packed__))` and `sizeof(FOO) == 8` without it. OnlineGDB uses the gcc compiler. – Andreas Wenzel Jul 23 '20 at 12:27
  • I think the 'twobytes' variable might be aligned to 4 bytes. Then, when you read 8 bytes in total, the 2 'lost' bytes end up in the 2 bytes between the 'twobytes' and 'fourbytes' variables. Can you print all 8 bytes of that struct separately? – Marcin Jul 23 '20 at 13:22
  • @Marcin: Yes, that alignment is the default behavior. The point of `__attribute__((__packed__))` is to override this default behavior. – Andreas Wenzel Jul 23 '20 at 13:26
  • @AndreasWenzel Added the information you requested. `sizeof(FOO)` is indeed `8`. But my question of why it "skips" those bytes while *reading* does not feel answered yet. @ikegami Added my compiler version as well. – Paul Tashkent Jul 23 '20 at 19:33
  • @PaulTashkent, there is no reason to think that `fread()` skips any bytes. The issue is *where it puts* the bytes it reads, and how that relates to the `twobytes` and `fourbytes` members of your `struct`. If you don't see the expected bytes in either of those members, then the only likely answer is that they go into padding bytes in the struct between those two members. Why your compiler accepts `__attribute__((__packed__))` yet still lays out the structure with padding is a different question. – John Bollinger Jul 23 '20 at 20:20
  • Now that you are also printing `&foo.fourbytes`, you can see that this address is 4 bytes apart from `&foo.twobytes`, so there must be two bytes of padding. This is very useful information. – Andreas Wenzel Jul 23 '20 at 20:33
  • @JohnBollinger So basically all boils down to my compiler being weird...? – Paul Tashkent Jul 23 '20 at 21:06
  • @PaulTashkent, there *is* some kind of compiler weirdness going on, but please don't take it the wrong way that I attribute the problem to relying on language extensions in the first place. – John Bollinger Jul 23 '20 at 21:08
  • @PaulTashkent: As pointed out in the answer, you may also want to try to change the position of `__attribute__ ((__packed__))`. – Andreas Wenzel Jul 23 '20 at 21:08
  • @AndreasWenzel I did try to reposition the attribute wherever I could. For example, even separating `typedef` from the `struct` declaration itself didn't help. – Paul Tashkent Jul 23 '20 at 21:11
  • Hmmm, I [tested it with gcc version 6.3 on godbolt](https://godbolt.org/z/7ecT7o), and it works as it should. Are you using any special compiler options? – Andreas Wenzel Jul 23 '20 at 21:20
  • @AndreasWenzel Not that I know of. I set everything up a few years ago and haven't used it until now. I may try reinstalling everything C related and actually look into what I'm doing over the next few days... – Paul Tashkent Jul 23 '20 at 21:47
  • @PaulTashkent: I don't think that the problem is your compiler being old, because even if I set the compiler to gcc version 5.1 on godbolt (link see above), it still works. – Andreas Wenzel Jul 23 '20 at 21:49
  • @PaulTashkent: do you have options such as `-ansi` or `-std=c99`? – chqrlie Jul 23 '20 at 21:56
  • @PaulTashkent: I can reproduce your problem, but it takes a `"-D__attribute__(x)="` command line option, which would be a very vicious thing to hide in a Makefile – chqrlie Jul 23 '20 at 22:05
  • @AndreasWenzel @chqrlie The console in my IDE says: `gcc -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP -MF"src/fread.d" -MT"src/fread.o" -o "src/fread.o" "../src/fread.c"` – Paul Tashkent Jul 23 '20 at 22:08
  • [This answer](https://stackoverflow.com/a/31622291/12149471) sounds interesting, especially #2 of the answer. Also, [this answer](https://stackoverflow.com/a/24016239/12149471) might be interesting; it recommends using the `-mno-ms-bitfields` command-line option in the compiler. In a comment, someone also recommends using [`gcc_struct`](https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Variable-Attributes.html). – Andreas Wenzel Jul 23 '20 at 23:23
  • [This answer](https://stackoverflow.com/a/37199319/12149471) is also interesting; it recommends using [`#pragma pack`](https://msdn.microsoft.com/library/2e70t5y1) instead. – Andreas Wenzel Jul 23 '20 at 23:48
  • @AndreasWenzel Yes! `#pragma pack()` works for me. Thank you for finding and posting that answer! – Paul Tashkent Jul 24 '20 at 14:25

3 Answers

3

From the output your program produces, it seems the compiler ignores the __attribute__((__packed__)) specification.

The gcc online user's guide documents the __attribute__ ((__packed__)) type attribute with an example where this attribute is placed before the { of the definition.

This extension is non-standard, so different compilers, or different versions of the same compiler, may handle it differently depending on where the attribute is placed. If you use gcc, moving the attribute should fix the problem. If you use a different compiler, check its documentation to find out what it does differently.

Also note these remarks:

  • the file should be opened in binary mode, with "rb";
  • the sizeof(FOO) argument should be cast to (int) for the %d conversion specifier;
  • pointer arguments for %p should be cast to (void *);
  • foo.twobytes has the same address as foo, as the C Standard mandates, and &foo.fourbytes is located 4 bytes further on, which means foo.fourbytes is aligned and there are 2 padding bytes between the two members.
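
To make the last remark concrete, here is a minimal sketch (not part of the original program) that copies the same 8 bytes fread() would deliver into an unpacked struct. The names UnpackedFoo and load_unpacked are hypothetical, and the layout described assumes a typical little-endian ABI where uint32_t is 4-byte aligned; other platforms may differ.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>
#include <string.h>  /* memcpy */

/* Hypothetical unpacked counterpart of FOO; its layout is ABI-dependent. */
typedef struct {
    uint16_t twobytes;
    uint32_t fourbytes;
} UnpackedFoo;

/* Mimics fread(&foo, sizeof foo, 1, file): the bytes are copied verbatim,
 * so whatever lands at the padding offsets is visible to neither member. */
UnpackedFoo load_unpacked(const unsigned char *bytes, size_t len)
{
    UnpackedFoo foo;
    assert(bytes != NULL && len >= sizeof foo);
    memcpy(&foo, bytes, sizeof foo);
    return foo;
}
```

Given the first 8 bytes of the file (22 11 66 55 44 33 11 11) on a platform where offsetof(UnpackedFoo, fourbytes) is 4, the bytes 66 55 end up in the padding and fourbytes is built from 44 33 11 11, which matches the 0x11113344 shown in the question. Nothing is skipped while reading; the bytes are simply stored where no member looks.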

Try modifying your code this way:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    FILE *file = fopen("216543110.txt", "rb");
    if (file == NULL) {
        return 1;
    }

    typedef struct __attribute__((__packed__)) {
        uint16_t twobytes;
        uint32_t fourbytes;
    } FOO;
    
    FOO foo;
    
    if (fread(&foo, sizeof(FOO), 1, file) == 1) {
        printf("twobytes : 0x%x\n", foo.twobytes);
        printf("fourbytes: 0x%x\n\n", foo.fourbytes);

        printf("sizeof(FOO) = %d\n", (int)sizeof(FOO));
        printf("&foo     : %p\n", (void *)&foo);
        printf("&foo.two : %p\n", (void *)&foo.twobytes);
        printf("&foo.four: %p\n", (void *)&foo.fourbytes);
    }
    fclose(file);
    return 0;
}
chqrlie
  • Although your answer is good and I have upvoted it, I would like to point out that the main problem was something different: The `__attribute__((__packed__))` has no effect on structs with `__attribute__((ms_struct))`, which is the default when targeting the Microsoft Windows x86 platform. See my answer for more information. – Andreas Wenzel Jul 24 '20 at 14:53
  • @AndreasWenzel: excellent point! Sorry I missed your latest comments... So much time has been wasted for decades dealing with the quirks and pitfalls of MS legacy platforms... More money gone down the drain than even the market value of this company(!) – chqrlie Jul 24 '20 at 17:17
2

On GCC, when targeting x86 platforms, __attribute__((__packed__)) only works on structs with __attribute__((gcc_struct)).

However, when targeting Microsoft Windows platforms, the default attribute for structs is __attribute__((ms_struct)).

Therefore, I see three ways to accomplish what you want:

  1. Use the compiler command-line option -mno-ms-bitfields to make all structs default to __attribute__((gcc_struct)).
  2. Explicitly use __attribute__((gcc_struct)) on your struct.
  3. Use #pragma pack instead of __attribute__((__packed__)).
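
As a rough illustration of option 3, the sketch below wraps the struct in #pragma pack(push, 1) / #pragma pack(pop) and checks the resulting layout at compile time. The names PackedFoo and read_packed_foo are hypothetical and not taken from the question; the compile-time checks use _Static_assert, which assumes a C11-capable compiler.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */
#include <stdint.h>
#include <stdio.h>

#pragma pack(push, 1)   /* option 3: request no padding between members */
typedef struct {
    uint16_t twobytes;
    uint32_t fourbytes;
} PackedFoo;
#pragma pack(pop)       /* restore the previous packing for later declarations */

/* Fail at compile time if the compiler did not honour the packing request. */
_Static_assert(sizeof(PackedFoo) == 6, "PackedFoo is not packed");
_Static_assert(offsetof(PackedFoo, fourbytes) == 2, "unexpected padding");

/* Reads one 6-byte record; returns 1 on success, 0 on a short read or error. */
int read_packed_foo(FILE *stream, PackedFoo *out)
{
    assert(stream != NULL && out != NULL);
    return fread(out, sizeof *out, 1, stream) == 1;
}
```

The static assertions are the useful part of the design: if packing is silently ignored, as happened with __attribute__((__packed__)) in the question, the build fails instead of the program reading into padding. The fields are still interpreted in the host's byte order, so this only matches the file on a little-endian machine.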

Also, as pointed out in the answer by @chqrlie, a few other things in your code are not ideal. In particular, when reading binary data you should normally open the file in binary mode rather than text mode, unless you know what you are doing (which you possibly do, since the file has a .txt extension).

Andreas Wenzel
1

Since the data layout in memory differs from the one in the file, it may be better to read the members of the struct one by one. For example, offsetof lets you specify where in the struct each member should be read to. The following code reads the members of the struct with the fread_members function.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h> /* offsetof */

/* offset and size of each member */
typedef struct {
    size_t offset;
    size_t size;
} MEMBER;

#define MEMBER_ELM(type, member) {offsetof(type, member), sizeof(((type*)NULL)->member)}

size_t fread_members(void *ptr, MEMBER *members, FILE *stream) {
    char *top = (char *)ptr;
    size_t rs = 0;
    int i;
    for(i = 0; members[i].size > 0; i++){
        rs += fread(top + members[i].offset, 1, members[i].size, stream);
    }
    return rs;
}

int main(void) {

    FILE* file = fopen("216543110.txt", "rb"); /* binary mode: no newline translation */
    if (file==NULL) { return 1; }

    typedef struct
    {
        uint16_t twobytes;
        uint32_t fourbytes;
    } FOO;

    MEMBER members[] = {
        MEMBER_ELM(FOO, twobytes),
        MEMBER_ELM(FOO, fourbytes),
        {0, 0} /* terminated */
    };

    FOO foo;

    fread_members(&foo, members, file);

    :
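
A related variation, offered here only as a sketch and not as this answer's method, is to read the raw 6-byte record into a buffer and assemble each field with shifts. That avoids depending on struct padding or on the host's byte order. The names DecodedFoo, decode_foo, read_foo and FOO_RECORD_SIZE are hypothetical, and the code assumes the file stores both fields in little-endian order, as in the question.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

enum { FOO_RECORD_SIZE = 6 };  /* 2-byte field + 4-byte field, as stored in the file */

typedef struct {
    uint16_t twobytes;
    uint32_t fourbytes;
} DecodedFoo;

/* Assembles both fields from little-endian bytes; the result does not
 * depend on struct padding or on the host's byte order. */
DecodedFoo decode_foo(const unsigned char *buf)
{
    DecodedFoo foo;
    assert(buf != NULL);
    foo.twobytes  = (uint16_t)(buf[0] | (buf[1] << 8));
    foo.fourbytes = (uint32_t)buf[2]
                  | ((uint32_t)buf[3] << 8)
                  | ((uint32_t)buf[4] << 16)
                  | ((uint32_t)buf[5] << 24);
    return foo;
}

/* Reads one record from the stream; returns 1 on success, 0 otherwise. */
int read_foo(FILE *stream, DecodedFoo *out)
{
    unsigned char buf[FOO_RECORD_SIZE];
    if (fread(buf, 1, sizeof buf, stream) != sizeof buf) {
        return 0;
    }
    *out = decode_foo(buf);
    return 1;
}
```

For the bytes 22 11 66 55 44 33 from the question, decode_foo yields twobytes 0x1122 and fourbytes 0x33445566, regardless of how the compiler lays out DecodedFoo.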
etsuhisa
  • Yes, this is a good workaround, so I have upvoted it. However, it still only is a workaround, not a solution to the underlying problem. See my answer for the solution to the underlying problem. – Andreas Wenzel Jul 24 '20 at 14:37
  • Your answer to this question is direct, certain, simple and great. However, if the data structure and its struct are not confined to a single function, the alignment problem has to be dealt with elsewhere in the program: when copying to another struct or array instance, when porting to other platforms, when accessing members without triggering SIGBUS, etc. – etsuhisa Jul 25 '20 at 06:21