struct book
{
    unsigned short  size_of_content;
    unsigned short  price;
    unsigned char  *content;
};

Assume I have a file that contains multiple books, each with a different size_of_content, price and content. How can I read them one book at a time and identify which book it is (by checking the price, for example)?

size_t nread2;
struct book *buff = malloc(sizeof(struct book));
while( (nread2 = fread(buff, sizeof(struct book), 1, infp)) > 0 )
{
    printf("read a struct once \n");
}

This is what I have so far. I tried to print a message every time I read a struct. However, when I tried an input file with 5 structs, it printed 15 times...

Thanks.

ChrisF
Pig

1 Answer


Let's look at your struct and think about how big it is.

struct book {
    unsigned short  size_of_content;
    unsigned short  price;
    unsigned char  *content;
};

The first item is an unsigned short, which is no problem: sizeof(unsigned short) will probably be 2, as in 2 bytes. The same goes for the next one.

But that third one is a problem. It's a pointer to unsigned char, and your disk records almost certainly don't contain saved pointers (a pointer value is only meaningful inside the process that created it). You have a field size_of_content, so my guess is that each disk record contains the size_of_content, then the price, and then the actual content bytes.
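To see why reading sizeof(struct book) bytes at a time goes wrong, it can help to compare the in-memory size of the struct with the size of the two fixed fields. This is only a small sketch; the helper names fixed_fields_size() and print_sizes() are made up for illustration, and it assumes the two shorts are stored back to back in the file.

```c
#include <stdio.h>

struct book {
    unsigned short  size_of_content;
    unsigned short  price;
    unsigned char  *content;   /* a memory address, not file data */
};

/* Bytes the two fixed-size fields occupy on disk (hypothetical helper),
 * assuming the file stores them back to back with no padding. */
size_t fixed_fields_size(void)
{
    return sizeof(unsigned short) + sizeof(unsigned short);
}

/* Prints both sizes so the mismatch is visible (hypothetical helper). */
void print_sizes(void)
{
    printf("sizeof(struct book)  = %zu\n", sizeof(struct book));
    printf("fixed fields on disk = %zu\n", fixed_fields_size());
}
```

The struct is always larger than the two shorts because it also holds the pointer plus any padding the compiler adds, so each fread() of sizeof(struct book) bytes consumes a chunk that has nothing to do with the real record boundaries.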

I won't write the complete code for you, but in pseudocode it goes something like this:

fread(&size_of_content, sizeof(size_of_content), 1, infp)
sanity-check the value of size_of_content and handle any error
fread(&price, sizeof(price), 1, infp)
sanity-check the value of price and handle any error
buff->content = malloc(size_of_content)
check for error on malloc and handle any error
fread(buff->content, size_of_content, 1, infp)

If you don't have a hard-and-fast spec for how big the content can be, pick a generous upper bound and reject anything larger. Here size_of_content is an unsigned short, so on typical systems it can never exceed 65,535, but a corrupt file can still claim more content than is actually present. Always check for errors.
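For reference, the pseudocode above could be fleshed out roughly like this. It is a sketch, not the one true implementation: the function name read_book() and its return convention are my own, and it assumes each record is laid out as size_of_content, then price, then the content bytes, written on a machine with the same byte order.

```c
#include <stdio.h>
#include <stdlib.h>

struct book {
    unsigned short  size_of_content;
    unsigned short  price;
    unsigned char  *content;
};

/* Reads one record into *b (hypothetical helper).
 * Returns 1 on success, 0 at end of file, -1 on a read or allocation error.
 * On success the caller owns b->content and must free() it. */
int read_book(FILE *infp, struct book *b)
{
    b->content = NULL;

    /* No more data where a record should start is treated as end of file. */
    if (fread(&b->size_of_content, sizeof b->size_of_content, 1, infp) != 1)
        return feof(infp) ? 0 : -1;

    if (fread(&b->price, sizeof b->price, 1, infp) != 1)
        return -1;

    /* Ask for at least one byte so a zero-length book still gets a valid pointer. */
    b->content = malloc(b->size_of_content ? b->size_of_content : 1);
    if (b->content == NULL)
        return -1;

    /* A short read here means the file is truncated or the size field is wrong. */
    if (fread(b->content, 1, b->size_of_content, infp) != b->size_of_content) {
        free(b->content);
        b->content = NULL;
        return -1;
    }
    return 1;
}
```

You would call it in a loop, something like while (read_book(infp, &b) == 1) { ...look at b.price...; free(b.content); }, and treat a final -1 as a damaged file.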

Since there are only two fixed-size fields in the struct, it's pretty easy to just fread() each one. If you had a more complex structure, it might be worth putting the fixed-size fields in a struct of their own:

struct book_header {
    unsigned short  size_of_content;
    unsigned short  price;
};

struct book {
    struct book_header header;
    unsigned char *content;
};

Then you can use fread() with sizeof(struct book_header) to read the whole header in one go. I've written a lot of code like this when working with binary files like wave audio files.
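A minimal sketch of the header approach follows. The name read_header() is hypothetical, and the static assertion spells out the assumption that makes this safe: the compiler adds no padding to the header struct, so its in-memory layout matches the bytes in the file.

```c
#include <stdio.h>

struct book_header {
    unsigned short  size_of_content;
    unsigned short  price;
};

/* Reading the struct in one go only works if its memory layout matches the
 * file layout, so refuse to compile if the compiler inserted padding. */
_Static_assert(sizeof(struct book_header) == 2 * sizeof(unsigned short),
               "book_header must not contain padding");

/* Reads one header with a single fread() (hypothetical helper).
 * Returns 1 on success, 0 on end of file or error. */
int read_header(FILE *infp, struct book_header *h)
{
    return fread(h, sizeof *h, 1, infp) == 1;
}
```

After a successful call, h->size_of_content tells you how many content bytes to malloc() and fread() next.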


You probably don't need to worry about this, but it would be a problem if the file was written on a "big-endian" computer and read on a "little-endian" computer, or vice-versa.

http://en.wikipedia.org/wiki/Endianness

If you did have that problem, the solution is to standardize. Pick one byte order (little-endian or big-endian) for the file format and always write and read the numbers in that order. For example, the POSIX functions htonl() (when writing) and ntohl() (when reading) convert 32-bit values to and from big-endian "network" byte order; for 16-bit fields like the unsigned shorts here, the matching pair is htons() and ntohs().

http://linux.die.net/man/3/htonl
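As a rough illustration for the 16-bit fields in this question, here is how the 16-bit relatives of those functions, htons() and ntohs(), could be used. The wrapper names write_u16_be() and read_u16_be() are invented for this sketch, and it assumes a POSIX system (these functions live in <arpa/inet.h>, not in standard C) and a file format defined as big-endian.

```c
#include <arpa/inet.h>  /* htons, ntohs (POSIX, not standard C) */
#include <stdint.h>
#include <stdio.h>

/* Writes a 16-bit value in big-endian order (hypothetical helper).
 * Returns 1 on success, 0 on failure. */
int write_u16_be(FILE *fp, uint16_t value)
{
    uint16_t wire = htons(value);   /* host order -> big-endian */
    return fwrite(&wire, sizeof wire, 1, fp) == 1;
}

/* Reads a big-endian 16-bit value into *value (hypothetical helper).
 * Returns 1 on success, 0 on end of file or error. */
int read_u16_be(FILE *fp, uint16_t *value)
{
    uint16_t wire;
    if (fread(&wire, sizeof wire, 1, fp) != 1)
        return 0;
    *value = ntohs(wire);           /* big-endian -> host order */
    return 1;
}
```

With wrappers like these, the bytes in the file are the same no matter which kind of machine wrote them.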

But as I said, you probably don't need to worry about this.

steveha
  • This is just an arbitrary question; the actual size of content would not be very large? – Pig Feb 12 '14 at 07:59
  • I have no way to guess how large the actual size of content might be, but I would not expect it to be large. You can look at the size of the input file, and you know that must be five records, so they are probably about 1/5 of the file size or something like that. – steveha Feb 12 '14 at 08:02
  • Thanks a lot man! One thing I am still not sure about: since there are multiple books in the file, how can fread know which one to look for? – Pig Feb 12 '14 at 09:02
  • sizeof(unsigned short) will probably still be 2 bytes regardless of 32 or 64-bit systems since most systems now use LP64 or LLP64 model, where int and below don't change their size – phuclv Feb 12 '14 at 09:10
  • @LưuVĩnhPhúc, thank you for the correction. I was tired and somehow didn't notice that it was a `short`. Corrected, and now I go to sleep. – steveha Feb 12 '14 at 09:23
  • @LesbianSquirtle, if the file is just a series of records, there is no way to know which one to look for. You would just have to read each one in turn until you found the one you need. That's why databases have indices. You could add an index but then the file format would be more complex... or you could just use SQLite if you need a real database with real database features like indices. – steveha Feb 12 '14 at 09:25
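
To make "read each one in turn" concrete, here is a sketch of a linear scan that looks for a book by price. The function name find_book_by_price() is hypothetical, and it assumes the record layout guessed in the answer (size_of_content, then price, then the content bytes) and a seekable file. It skips the content of non-matching books with fseek() instead of reading it.

```c
#include <stdio.h>

/* Scans records from the current file position (hypothetical helper).
 * Returns the 0-based index of the first book with the wanted price,
 * leaving the stream positioned at the start of that book's content,
 * or -1 if no book matches or the file ends early. */
long find_book_by_price(FILE *infp, unsigned short wanted_price)
{
    unsigned short size_of_content, price;
    long index = 0;

    while (fread(&size_of_content, sizeof size_of_content, 1, infp) == 1 &&
           fread(&price, sizeof price, 1, infp) == 1) {
        if (price == wanted_price)
            return index;

        /* Not the one we want: jump over its content to the next record. */
        if (fseek(infp, (long)size_of_content, SEEK_CUR) != 0)
            return -1;
        index++;
    }
    return -1;
}
```

The cost grows with the number of books in the file, which is exactly the problem an index (or a real database) solves.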