0

For a lab we are required to read from binary files using low-level I/O (open/lseek/close, not fopen/fseek/fclose) and manipulate the data. My question is: how do I read or write structs using these functions?

The struct is as follows

typedef struct Entry {
    char title[33];
    char artist[17];
    int  val;
    int  cost;
} Entry_T;

I originally planned to create a buffer of sizeof(Entry_T) bytes and simply read the struct into it, but I don't think that's possible using low-level I/O. Am I supposed to create four buffers and fill them sequentially, use one buffer and reallocate it to the right sizes, or do something else entirely? An example of writing would be helpful as well, but I think I can figure that out once I see a read example.

CrypticStorm
  • 2
    It's possible to read a whole struct like yours using a single `read` command. – egur Apr 28 '14 at 19:48
  • read into buffer then use pointer to access each element. Don't forget to cast the types if needed. or cast the buffer? – TWhite Apr 28 '14 at 19:50
  • 1
    Bulk-reading this using a single *read* is *very* platform-specific. How you read this *properly* is entirely dependent on how it was *written*. You have limited options, but if you control *both* ends and platform independence is a goal, it can be... tedious. – WhozCraig Apr 28 '14 at 19:51
  • Do you know the exact format of the file? Is it the natural `struct` format of the same architecture where it will run, including padding and/or endianness? Or are all fields in contiguous bytes? – aschepler Apr 28 '14 at 19:59

3 Answers

5

Because your structures contain no pointers and all elements are fixed size, you can simply write and read the structures. Error checking omitted for brevity:

const char *filename = "...";
int fd = open(filename, O_RDWR|O_CREAT, 0644);
Entry_T ent1 = { "Spanish Train", "Chris De Burgh", 100, 30 };
ssize_t w_bytes = write(fd, &ent1, sizeof(ent1));
lseek(fd, 0L, SEEK_SET);   /* rewind so the read starts at the record just written */
Entry_T ent2;
ssize_t r_bytes = read(fd, &ent2, sizeof(ent2));
assert(w_bytes == r_bytes);
assert(w_bytes == sizeof(ent1));
assert(strcmp(ent1.title, ent2.title) == 0);
assert(strcmp(ent1.artist, ent2.artist) == 0);
assert(ent1.val == ent2.val && ent1.cost == ent2.cost);
close(fd);

If your structures contain pointers or variable length members (flexible array members), you have to work harder.

Data written like this is not portable unless all the data is in strings. If you migrate the data between a big-endian and a little-endian machine, one side will misinterpret what the other thinks it wrote. Similarly, there can be problems when moving data between 32-bit and 64-bit builds on a single machine architecture: if you have long data, for example, the 32-bit build probably uses sizeof(long) == 4 but the 64-bit build probably uses sizeof(long) == 8, unless you're on Windows, where long stays 4 bytes in 64-bit builds.
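One way around the portability problem described above is to serialize each field into a byte buffer with a fixed layout and byte order, then `read()`/`write()` that buffer instead of the raw struct. The following is only a sketch: the 58-byte little-endian layout, the `ENTRY_DISK_SIZE` constant, and the helper names `put_le32`, `get_le32`, `entry_write_portable`, and `entry_read_portable` are all assumptions made for illustration, and it only helps if the file is written the same way.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

typedef struct Entry {
    char title[33];
    char artist[17];
    int  val;
    int  cost;
} Entry_T;

/* Hypothetical on-disk layout: title, artist, then two 32-bit
 * little-endian integers, with no padding (33 + 17 + 4 + 4 bytes). */
enum { ENTRY_DISK_SIZE = 33 + 17 + 4 + 4 };

static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

static uint32_t get_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 on error or short write. */
int entry_write_portable(int fd, const Entry_T *e)
{
    unsigned char buf[ENTRY_DISK_SIZE];

    assert(e != NULL);
    memcpy(buf, e->title, sizeof e->title);
    memcpy(buf + 33, e->artist, sizeof e->artist);
    put_le32(buf + 50, (uint32_t)e->val);
    put_le32(buf + 54, (uint32_t)e->cost);
    return write(fd, buf, sizeof buf) == (ssize_t)sizeof buf ? 0 : -1;
}

/* Returns 0 on success, -1 on error, end of file, or short read. */
int entry_read_portable(int fd, Entry_T *e)
{
    unsigned char buf[ENTRY_DISK_SIZE];

    assert(e != NULL);
    if (read(fd, buf, sizeof buf) != (ssize_t)sizeof buf)
        return -1;
    memcpy(e->title, buf, sizeof e->title);
    memcpy(e->artist, buf + 33, sizeof e->artist);
    /* Converting values above INT_MAX back to int is implementation-defined;
     * on common two's-complement platforms negative values round-trip. */
    e->val  = (int)get_le32(buf + 50);
    e->cost = (int)get_le32(buf + 54);
    return 0;
}
```

The trade-off is more code per field in exchange for a file that does not depend on the compiler's padding or the machine's byte order.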

Jonathan Leffler
  • Will read automatically malloc space or use the stack? My read function is an isolated method and must allocate the Entry to the heap. – CrypticStorm Apr 28 '14 at 20:04
  • 2
    +1, and I wish I could up-tick this again for the portability elaboration. As usual, very well written. @CrypticStorm if you're dyna-allocating this for a return result, you could (a) make the *caller* hand *you* a valid address to one of these and put the onus on *them* to figure out where the memory comes from, or (b) do this first, and only *after* you know it succeeded (good retval from your `read` call), allocate using `malloc()` and copy the local structure to the dynamic one, then return it as a result which they `free()`, else NULL. Personally, I prefer the former of these options. – WhozCraig Apr 28 '14 at 20:05
  • 1
    `read()` does not automatically allocate space. You'll need to allocate the structure (e.g. by creating a local variable of the right type). If you use `malloc()`, you'll also need to know where you're going to do `free()` or you'll leak memory. If you have variable length fields, you'll probably end up using `malloc()` and `free()` too. – Jonathan Leffler Apr 28 '14 at 20:06
  • Don't forget about [Padding](http://stackoverflow.com/a/4306269/2705293). If all of the data is written/read using structs, you'll be fine, but if each member is written individually, then a single read using a struct might be off. To avoid this using GCC use `__attribute__((__packed__))`. – jmstoker Apr 28 '14 at 20:18
  • 1
    @stoker: I would far rather not use packed and would insist that there's only one way to read or write the structures correctly, using whole structure reads and writes. Anything else is a bug in the code accessing the data. – Jonathan Leffler Apr 28 '14 at 20:22
  • 1
    +1 for this answer checks the result of `read()` - an important step. – chux - Reinstate Monica Apr 28 '14 at 20:23
  • @JonathanLeffler I agree. – jmstoker Apr 28 '14 at 21:54
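To make the padding point from the comments concrete, a small sketch like the following can report how the compiler actually lays out `Entry_T`. The function names `entry_padding_bytes` and `entry_print_layout` are hypothetical. On many common ABIs with a 4-byte `int`, the two `char` arrays occupy 50 bytes and `val` is aligned up to offset 52, but that is an assumption about the platform, which is exactly why it is worth printing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct Entry {
    char title[33];
    char artist[17];
    int  val;
    int  cost;
} Entry_T;

/* Bytes the compiler added beyond the members themselves. */
size_t entry_padding_bytes(void)
{
    size_t members = sizeof(((Entry_T *)0)->title)
                   + sizeof(((Entry_T *)0)->artist)
                   + sizeof(((Entry_T *)0)->val)
                   + sizeof(((Entry_T *)0)->cost);
    return sizeof(Entry_T) - members;
}

/* Print the size of the struct and the offset of each member. */
void entry_print_layout(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "sizeof(Entry_T) = %zu\n", sizeof(Entry_T));
    fprintf(out, "title  at offset %zu\n", offsetof(Entry_T, title));
    fprintf(out, "artist at offset %zu\n", offsetof(Entry_T, artist));
    fprintf(out, "val    at offset %zu\n", offsetof(Entry_T, val));
    fprintf(out, "cost   at offset %zu\n", offsetof(Entry_T, cost));
    fprintf(out, "padding bytes   = %zu\n", entry_padding_bytes());
}
```

If the file was produced by whole-struct writes on the same platform and compiler settings, these offsets match the file; if each member was written individually, the padding bytes will not be present in the file.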
1

The low-level functions may be OS-specific, but the stdio calls generally correspond to these:

fopen() ->  open()
fread() ->  read()
fwrite() -> write()
fclose() -> close()

Note that while the 'fopen()' set of functions uses a 'FILE *' as a token to represent the file, the 'open()' set of functions uses an integer (int) file descriptor.

Using read() and write(), you can read and write entire structures. So, for:

typedef struct Entry {
    char title[33];
    char artist[17];
    int  val;
    int  cost;
} Entry_T;

You can read or write it as follows:

{
    int fd = (-1);
    Entry_T entry;
    ...

    fd = open(...);
    ...

    read(fd, &entry, sizeof(entry));
    ...

    write(fd, &entry, sizeof(entry));
    ...

    if ((-1) != fd)
        close(fd);
}
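Since the question also mentions lseek, here is a hedged sketch of how whole-struct reads and writes combine with `lseek()` to reach the n-th record in a file. It assumes the file holds nothing but back-to-back `Entry_T` records written with the same struct layout; the helper names `entry_open`, `entry_read_at`, and `entry_write_at` are made up for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct Entry {
    char title[33];
    char artist[17];
    int  val;
    int  cost;
} Entry_T;

/* Open (or create) the record file for reading and writing.
 * Returns the descriptor, or -1 on error. */
int entry_open(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0644);
}

/* Read record number `index` (0-based) into *out.
 * Returns 0 on success, -1 on error or if the record is not fully present. */
int entry_read_at(int fd, off_t index, Entry_T *out)
{
    assert(out != NULL);
    if (lseek(fd, index * (off_t)sizeof *out, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, out, sizeof *out) == (ssize_t)sizeof *out ? 0 : -1;
}

/* Write *in as record number `index` (0-based).
 * Returns 0 on success, -1 on error or short write. */
int entry_write_at(int fd, off_t index, const Entry_T *in)
{
    assert(in != NULL);
    if (lseek(fd, index * (off_t)sizeof *in, SEEK_SET) == (off_t)-1)
        return -1;
    return write(fd, in, sizeof *in) == (ssize_t)sizeof *in ? 0 : -1;
}
```

Treating any short count as failure keeps the sketch simple; code that must cope with interrupted calls would loop until the full record has been transferred.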
Mahonri Moriancumer
0

Here you go:

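Entry_T t;
int fd = open("file_name", O_RDONLY);  /* POSIX spelling; MSVC's <io.h> uses _O_RDONLY */
if (fd == -1)
{
    /* handle the error (e.g. perror("open")) and do not go on to read() */
}

if (read(fd, &t, sizeof(t)) != (ssize_t)sizeof(t))
{
    /* error or short read: t is not fully populated */
}
close(fd);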
luk32
egur
  • I think you should also add the calls necessary to save an `Entry_T` using `write`. – R Sahu Apr 28 '14 at 19:58
  • This will put Entry_T on the stack correct? I would have to allocate it to the heap if I need to afterwards. That requires strcpy() IIRC. – CrypticStorm Apr 28 '14 at 19:58
  • 1
    @CrypticStorm: You can use structure assignment with your structure (because everything is fixed size and no pointers). So you could use `Entry_T *p = malloc(sizeof(*p)); *p = t; return p;` using the code from this answer. Or do the allocation and then `read(fd, p, sizeof(*p));` Error checking omitted as usual. – Jonathan Leffler Apr 28 '14 at 20:11
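The heap-allocation approach discussed in the comments (read into a local struct first, allocate only once the read has succeeded, then copy by structure assignment) might look like the sketch below. The function name `entry_read_alloc` is hypothetical, and the caller is assumed to own the result and `free()` it.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

typedef struct Entry {
    char title[33];
    char artist[17];
    int  val;
    int  cost;
} Entry_T;

/* Read one record from fd into newly allocated memory.
 * Returns a pointer the caller must free(), or NULL on read error,
 * end of file, short read, or allocation failure. */
Entry_T *entry_read_alloc(int fd)
{
    Entry_T tmp;
    Entry_T *p;

    assert(fd >= 0);
    if (read(fd, &tmp, sizeof tmp) != (ssize_t)sizeof tmp)
        return NULL;

    p = malloc(sizeof *p);
    if (p != NULL)
        *p = tmp;   /* structure assignment copies every member, no strcpy() needed */
    return p;
}
```

Allocating only after a successful read means there is nothing to free on the failure path. The alternative mentioned in the comments, where the caller passes in a pointer to an existing `Entry_T`, avoids the allocation altogether.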