I'm having trouble reading a binary file into a struct when the file follows a specific byte-level format. Given a byte-by-byte definition (Offset: 0 Length: 2 Value: 0xFF 0xD8; Offset: 2 Length: 2 Value: 0xFF 0xE1; etc.), I'm not sure how to define my structure or how to use file operations like fread() to read in the information I'm looking for.

Currently, my struct(s) are as follows:

struct header{
    char start; //0xFF 0xD8 required.
    char app1_marker; //0xFF 0xE1 required. Make sure app0 marker (0xFF 0xE0) isn't before.
    char size_of_app1_block; //big endian
    char exif_string; //"EXIF" required
    char NULL_bytes; //0x00 0x00 required
    char endianness; //II or MM (if not II break file)
    char version_number; //42 constant
    char offset; //4 blank bytes
};

and

struct tag{
    char tag_identifier;
    char data_type;
    char size_of_data;
    char data;
};

What data types should I be using if each attribute of the structure has a different (odd) byte length? Some require 2 bytes of space, others 4, and even others are variable/dynamic length. I was thinking of using char arrays, since a char is always one byte in C. Is this a good idea?

Also, what would be the proper way to use fread if I'm trying to read the whole structure in at once? Would I leave it at:

fread(&struct_type, sizeof(char), num_of_bytes, FILE*);

Any guidance to help me move past this wall would be greatly appreciated. I already understand basic structure declarations and construction. The two key issues I'm having are the varying, odd byte sizes of the fields and the proper way to read variable byte sizes into the struct in one fread statement.

Here is the project link: http://people.cs.pitt.edu/~jmisurda/teaching/cs449/2141/cs0449-2141-project1.htm

Fiki
  • You also need to be careful about compiler byte alignment - read this - http://en.wikipedia.org/wiki/Data_structure_alignment – OldProgrammer Sep 26 '13 at 23:28
  • The comments in your code don't seem to match what you've declared. For example, the comment on `exif_string` says _"EXIF" required_ but you declared it as a single char, which clearly can't contain "EXIF". The same applies to most other elements in your struct. The elements of the struct must match the sizes of the items you're expecting to read into the struct. – Carey Gregory Sep 27 '13 at 00:32
  • http://people.cs.pitt.edu/~jmisurda/teaching/cs449/2141/cs0449-2141-project1.htm Above is the link to the actual project. I've already done the first (unrelated) part. I handle the endianness via the descriptive field in the tag. – Fiki Sep 27 '13 at 00:56
  • Your structure doesn't follow the specifications (as Carey notes). Use the provided information: the `Length` in bytes in the tables. Note that you need to take care of [Structure padding](http://stackoverflow.com/questions/4306186/structure-padding-and-structure-packing), either by packing your structures or by reading items one at a time. – Jongware Sep 27 '13 at 01:04

1 Answer

When fread() is used on a structure, I usually see the size given as the structure size and the number of elements as 1, with the expectation that fread() returns 1:

size_t result = fread(&record, num_of_bytes, 1, infile);
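
To make the return-value convention concrete, here is a minimal sketch. The helper name `read_exact` is hypothetical and not part of the original answer. It asks for one item of `n` bytes, so fread() can only return 1 (all bytes arrived) or 0 (short read or error).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read exactly n bytes from fp into buf.
 * Returns 1 on success, 0 on a short read or an I/O error. */
int read_exact(FILE *fp, void *buf, size_t n)
{
    assert(fp != NULL && buf != NULL);
    if (n == 0)
        return 1;   /* nothing to read */
    /* One item of n bytes: fread returns 1 only if all n bytes were read. */
    return fread(buf, n, 1, fp) == 1;
}
```

By contrast, `fread(buf, 1, n, fp)` returns the number of bytes actually read, which is handy when a partial read is acceptable.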

I am uncertain how you figure out whether your endianness field is II or MM, but I guess the idea is that you decide whether or not to byte-swap the field values based on whether the file endianness matches the host endianness.

The actual data seems to be in the tag structures, where the last field, data, is really just a placeholder for variable-length data whose length is given by the size_of_data field. So I guess you would first read sizeof(struct tag) - 1 bytes, and then read size_of_data more bytes.

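/* Needs <stdio.h>, <stdlib.h> and <string.h>. */
struct tag taghdr;
/* Read the fixed part of the tag: everything except the 1-byte data placeholder. */
size_t result = fread(&taghdr, sizeof(taghdr) - 1, 1, infile);
if (result != 1) { /* ...handle error */ }
/* Allocate room for the fixed part plus size_of_data bytes of payload. */
struct tag *tagdata = malloc(sizeof(taghdr) + taghdr.size_of_data - 1);
if (tagdata == NULL) { /* ...no more memory */ }
memcpy(tagdata, &taghdr, sizeof(taghdr) - 1);
if (taghdr.size_of_data > 0) {
    /* Read the variable-length payload directly after the fixed part. */
    result = fread(&tagdata->data, taghdr.size_of_data, 1, infile);
    if (result != 1) { /* ...handle error */ }
}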
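
As an alternative that avoids struct padding concerns, the header can be read into a plain byte buffer and decoded by offset. This is only a sketch under assumptions, not the assignment's reference layout: it assumes a 20-byte header (2+2+2+4+2+2+2+4), with a 2-byte size field and a 2-byte version field, and the names `exif_header`, `parse_header`, and `read_header` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { HEADER_LEN = 20 };   /* assumed total: 2+2+2+4+2+2+2+4 bytes */

struct exif_header {
    uint16_t app1_size;     /* stored big-endian in the file */
    char     byte_order[2]; /* "II" or "MM" */
    uint16_t version;       /* expected to be 42 */
    uint32_t offset;        /* the 4-byte offset field */
};

/* Decode a raw header. Returns 0 on success, -1 if a check fails. */
int parse_header(const unsigned char *b, struct exif_header *h)
{
    assert(b != NULL && h != NULL);
    if (b[0] != 0xFF || b[1] != 0xD8) return -1;    /* start marker */
    if (b[2] != 0xFF || b[3] != 0xE1) return -1;    /* APP1 marker */
    h->app1_size = (uint16_t)((b[4] << 8) | b[5]);  /* big endian */
    /* Real JPEG files store the mixed-case bytes "Exif". */
    if (memcmp(b + 6, "Exif", 4) != 0) return -1;
    if (b[10] != 0x00 || b[11] != 0x00) return -1;  /* two NUL bytes */
    memcpy(h->byte_order, b + 12, 2);
    if (memcmp(h->byte_order, "II", 2) != 0) return -1; /* Intel only */
    /* The remaining fields are little-endian because of "II". */
    h->version = (uint16_t)(b[14] | (b[15] << 8));
    if (h->version != 42) return -1;
    h->offset = (uint32_t)b[16] | ((uint32_t)b[17] << 8) |
                ((uint32_t)b[18] << 16) | ((uint32_t)b[19] << 24);
    return 0;
}

/* Read the raw header bytes in one fread call, then decode them. */
int read_header(FILE *fp, struct exif_header *h)
{
    unsigned char buf[HEADER_LEN];
    if (fread(buf, sizeof buf, 1, fp) != 1) return -1;
    return parse_header(buf, h);
}
```

Because the buffer is an array of `unsigned char`, the compiler inserts no padding into it, and the in-memory struct can use whatever field types are convenient.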
jxh
  • I have a field that tells me II or MM, and I reject the non-Intel format by printing the descriptive error "The file is not in an acceptable format." I'm reading over your post multiple times now to make sure I understand it all. I'm going to link you to my project page as well, so there is a way to view the organized header/tags. Thanks for the help so far. – Fiki Sep 27 '13 at 00:54
  • My comment about `endianness` was specific to that field, but I had the same concern that @CareyGregory voiced. Basically, it seems like it should be `char endianness[2];`, not `char endianness;`. The other possibility is that `II` and `MM` are special codes that stand for a specific byte value. Since it wasn't clear, I voiced a concern. – jxh Sep 27 '13 at 01:06
  • I hope I didn't come off as being ungrateful! I am extremely thankful for your help! I just wanted to let you know that I had the endian concern figured out. You alluded to using char endianness[2] - could I use this for every field and just size each char array based on what I need? I wasn't sure what to use for data types either. – Fiki Sep 27 '13 at 01:15
  • You should include the data format definitions from the link you provided in your problem description itself. Part of your question is how to define the structures to correspond with the data format definitions. This isn't clear from your original question. – jxh Sep 27 '13 at 01:21