3

I have a binary file and I will be using fread to read the data from this binary file into an array of structures.

However, I don't know what value to pass to fread as its second argument. I know the file size is 536870912 bytes. The binary file was built to be read as a 512^3 array, so each entry in the file is a float occupying 4 bytes.

I originally wrote bits rather than bytes here. The C program I used to find the file size labels its output as bits, but the value it prints (536870912) is in bytes. Apologies to anyone confused.

Here is the code I'm using to read the data from the binary file into my array of structures (the structure is simplified; the real one has 10 other parameters):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

// Define the model structure
struct model {
        float density;
};


// Entry point for the program
int main () {
    int counter;
    long lSize;
    char * buffer;
    size_t result;
    FILE *pFile;
    int i,j,k,ibox;         /* Loop indices for the physical grid */

    struct model ***mymodel;

    pFile = fopen("core1_dens_0107.bin","rb");
    if (pFile == NULL) { printf("Unable to open density file!"); exit(1); }

    // obtain file size:
    fseek (pFile , 0 , SEEK_END);
    lSize = ftell (pFile);
    rewind (pFile);

    printf( "File size : %ld Bits \n", lSize );

    for ( j = 0 ; j < 512 ; j++ ) {
        for ( k = 0; k < 512; k++ ) {
            for ( i = 0; i < 512; i++ ) {
                fread(&mymodel[i][j][k].density,4,1,pFile);
                printf("%f \n",mymodel[i][j][k].density);
            }
        }
    }               

    fclose(pFile);
    return 0;
}
stars83clouds
  • How big is each record? – SheetJS Jun 28 '13 at 00:01
  • 2
    `536870912 bits / 32` ??? Or did the OP mean bytes? On second thought: I think he is confusing bits and bytes. (maybe he needs a _byte_ of guidance ;-) – wildplasser Jun 28 '13 at 00:02
  • Nirk - I guess the size of each record is 4 bits (a float). There are 134217728 records in the file and the file size is as specified above in bits. wildplasser - I meant bits! Why are you dividing by 32? – stars83clouds Jun 28 '13 at 00:07
  • I've never heard of a 4-bit float. It could only hold 16 possible values. – David Schwartz Jun 28 '13 at 00:07
  • @stars83clouds IEEE floats are typically 4 bytes, not bits... – Sergey L. Jun 28 '13 at 00:08
  • I think @stars83clouds is repeatedly confusing bits and bytes. – Barmar Jun 28 '13 at 00:09
  • Apologies, I got my bits and my bytes mixed up. Itching to get this fread to work! – stars83clouds Jun 28 '13 at 00:10
  • @DS: that would be 1+1 bit sign + 1 bit mantissa, + 1bit exponent. All unsigned? But how to encode the Nans ? – wildplasser Jun 28 '13 at 00:13
  • wildplasser and David, I hope you are clear that it was bytes and not bits, an error on my part - thanks for your helpful comments however! – stars83clouds Jun 28 '13 at 00:29
  • You are using the indices i,j,k in the wrong order; you can check this in my answer – Antonio Jun 28 '13 at 01:21
  • Yes, the order did look odd - although it doesn't stop my segmentation fault happening. – stars83clouds Jun 28 '13 at 01:28
  • More important: you are not allocating memory for mymodel! Define it as `mymodel[MY_DIM][MY_DIM][MY_DIM];`. Just to be sure: are you going to add other fields to your struct? Because otherwise you could directly use the buffer. – Antonio Jun 28 '13 at 01:34
  • Your question still says the file size is "536870912 bits", but if it holds 512**3 4-byte `float` values, then it should be exactly 536870912 *bytes*. Please update the question with the correct size. (No need to mention your previous error; just fix it, and you can delete the explanation of the bits/bytes confusion. If anyone is interested, they can look at the edit history.) – Keith Thompson Jun 28 '13 at 01:37
  • You might find [this question](http://stackoverflow.com/q/8589425/827263) about `fread`, and [my answer](http://stackoverflow.com/a/8589688/827263), useful. And don't forget to check the result returned by `fread`. – Keith Thompson Jun 28 '13 at 01:39
  • This is only test code Antonio, the actual model structure is allocated memory in another C program which calls this one. If I run this as a standalone program, must i allocate the memory beforehand? – stars83clouds Jun 28 '13 at 01:40
  • Keith, I read that question before I posted here - i will check your answer now. – stars83clouds Jun 28 '13 at 01:42
  • Hi Keith, how do you check the result returned by fread?? – stars83clouds Jun 28 '13 at 01:59
  • @stars83clouds If you run this as a standalone program then you have to allocate memory for your matrix. Easiest way to do so is: `mymodel = malloc(lSize)`. If you do not allocate memory you will get a segmentation fault. `fread` will return the number of blocks actually read (which can differ from the number you requested). Double-check that the return value matches your third argument, and if not, check `feof()` for end of file and `ferror()` for errors. – Sergey L. Jun 28 '13 at 11:45
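
The check described in the last comment could be sketched roughly as follows. This is only an illustration: the helper name `read_floats` and its return codes are made up for the example and do not come from the question.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads `count` floats from `fp` into `dst`.
   Returns 0 on success, 1 if the file ended early, 2 on a read error. */
int read_floats(FILE *fp, float *dst, size_t count)
{
    assert(fp != NULL && dst != NULL);

    /* fread returns the number of complete floats it read */
    size_t got = fread(dst, sizeof *dst, count, fp);
    if (got == count)
        return 0;
    if (ferror(fp))
        return 2;   /* the stream's error indicator is set */
    return 1;       /* short read without an error: end of file came first */
}
```

With the question's loop, calling something like `read_floats(pFile, &mymodel[i][j][k].density, 1)` and stopping on a non-zero result would show whether the file runs out early. It does not remove the need to allocate `mymodel` first.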

3 Answers

1

Supposing you have already opened the file and you have your FILE pointer myStream, it should be as simple as this:

#define MY_DIM 512 // Maybe you want to play safe and make it a little bit larger? Up to you

// 512 MB of floats is too large for the stack, so give the buffer static storage (or malloc it)
static float buffer[MY_DIM][MY_DIM][MY_DIM];

size_t readItems; // fread returns the number of items read, not the number of bytes

int i,j,k;
for (k = 0; k < MY_DIM; k++)
  for (j = 0; j < MY_DIM; j++) {
      readItems = fread(buffer[k][j], sizeof(float), MY_DIM, myStream); // no cast needed: float* converts to void* implicitly
      if (readItems < MY_DIM) // I unexpectedly reached the end of the file,
        goto endOfTheLoop;    // without reading all the data I needed
                              // You could also print a warning message
      }

endOfTheLoop:

//Now close the input file, use fclose or something

//Now that you have read all the data, you have to put it in your array of struct:
for (k = 0; k < MY_DIM; k++)
  for (j = 0; j < MY_DIM; j++)
    for (i = 0; i < MY_DIM; i++)
      mymodel[k][j][i].density = buffer[k][j][i]; 
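
If the array of structs is not allocated elsewhere, one way to get it onto the heap is a single flat block plus an index helper. The sketch below is a variation on the code above, not the original poster's layout: `load_density` and `idx` are hypothetical names, and it assumes the file stores the values with `i` varying fastest.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct model {
    float density;
};

/* Hypothetical helper: maps (k, j, i) to an offset in a flat n*n*n array,
   with i varying fastest, matching the loop order used above. */
static size_t idx(size_t n, size_t k, size_t j, size_t i)
{
    return (k * n + j) * n + i;
}

/* Allocates n*n*n structs on the heap and fills .density from fp,
   one row of n floats at a time. Returns NULL on any failure. */
struct model *load_density(FILE *fp, size_t n)
{
    assert(fp != NULL && n > 0);

    struct model *grid = malloc(n * n * n * sizeof *grid);
    float *row = malloc(n * sizeof *row);
    if (grid == NULL || row == NULL) {
        free(grid);
        free(row);
        return NULL;
    }

    for (size_t k = 0; k < n; k++) {
        for (size_t j = 0; j < n; j++) {
            if (fread(row, sizeof *row, n, fp) != n) {  /* short read or error */
                free(grid);
                free(row);
                return NULL;
            }
            for (size_t i = 0; i < n; i++)
                grid[idx(n, k, j, i)].density = row[i];
        }
    }

    free(row);
    return grid;
}
```

For the 512^3 file this would be called as `load_density(myStream, 512)`, and an element is then reached as `grid[idx(512, k, j, i)].density`. The caller is responsible for calling `free` on the returned pointer.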
Antonio
  • The array of structures is 3-dimensional Antonio, so I have included a nested for-loop that toggles between 0 and 512 along each dimension reading the data from the binary file as it does so. I am now sure that each data element in the binary file is 4 bytes in length and is of type float (obviously). I'm just unsure as to why the code terminates in a segmentation fault. Thanks for your code above, very readable. – stars83clouds Jun 28 '13 at 00:32
  • @star83clouds Please check my edit. (To see the old version, you can always check the history by clicking on "edited some time ago"). If you have a segfault, add some printf calls to quickly see how far the code gets before failing – Antonio Jun 28 '13 at 01:14
  • Thanks a bunch Antonio, i'll run this code on my system with my file. One thing though, does it matter about the order of the subscripts in the nested fors? I mean if you look at my code above, I have i-k-j in that order moving outwards and you have k-j-i. Plus, would you be better reading in 512*512 float values at a time above? – stars83clouds Jun 28 '13 at 01:23
  • @star83clouds There was an error, sorry, I am too tired... Please check carefully. The order of the indices doesn't really matter, as long as you are consistent within the same for loop. Yes, you could also read in bigger blocks, you are free to do so: in fact, you could put the whole content of the file in the buffer with one line. Note: when you are satisfied with one of the answers here (you have to choose just one though), you can accept it by clicking on the symbol under the score of the answer. – Antonio Jun 28 '13 at 01:29
0

You can pass whatever value of the 2nd argument is most convenient for your program. If you want to process the file one structure at a time, do:

nread = fread(&your_struct, 1, sizeof your_struct, stream);

If you have an array of structures, e.g.

struct foo your_struct[STRUCT_COUNT];

you can do:

nread = fread(your_struct, STRUCT_COUNT, sizeof *your_struct, stream);
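
One thing to keep in mind when choosing the 2nd and 3rd arguments is that the return value is counted in units of the 2nd argument. A small sketch of the difference, using two hypothetical wrapper functions that exist only for this illustration:

```c
#include <assert.h>
#include <stdio.h>

/* size = 1, so the return value is the number of bytes actually read */
size_t read_as_bytes(FILE *fp, void *dst, size_t nbytes)
{
    assert(fp != NULL && dst != NULL);
    return fread(dst, 1, nbytes, fp);
}

/* size = nbytes, so the return value is 1 only if the whole block was read,
   and 0 otherwise */
size_t read_as_one_block(FILE *fp, void *dst, size_t nbytes)
{
    assert(fp != NULL && dst != NULL);
    return fread(dst, nbytes, 1, fp);
}
```

On a file that holds fewer bytes than requested, the first form reports how many bytes arrived, while the second reports 0. Both forms ask for the same number of bytes in total.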
Barmar
-1

size_t fread(void *ptr, size_t size, size_t nmemb, FILE * stream );

will attempt to read nmemb blocks of size bytes each. The return value counts only complete blocks. If your blocks are 4 bits long then I suggest you read them byte by byte; otherwise use the size argument to specify the block size.

For instance

fread(buffer, 1, 1024, stdin);

will attempt to read 1024 bytes, but may stop at any point.

fread(buffer, 4, 256, stdin);

will also attempt to read 1024 bytes, but in blocks of 4 bytes, 256 blocks in total. The return value counts only complete 4-byte blocks.

fread(buffer, 1024, 1, stdin);

will attempt to read one block of 1024 bytes. If it cannot, fread returns 0 and the contents of the buffer should not be relied on.

If you wish to read in the entire file then you can do it in blocks of 4 via:

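size_t read = 0, read_now;        /* both counted in 4-byte blocks; buffer is a float * */
size_t total = filesize / 4;      /* number of 4-byte blocks in the file */
while (read < total && (read_now = fread(buffer + read, 4, total - read, in)) > 0)
    read += read_now;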

or you can attempt to read the whole thing in one go:

fread(buffer, filesize, 1, in);
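
The one-shot read assumes that `buffer` already points at `filesize` bytes of memory. A rough sketch of how the size, the allocation, and the read might fit together is shown below. The helper name `read_all_floats` is hypothetical, and it assumes the file contains nothing but raw floats.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads every float in the file into a freshly
   malloc'd buffer and stores the element count in *count.
   Returns NULL if the size cannot be determined, is not a multiple of
   sizeof(float), or if the allocation or the read fails. */
float *read_all_floats(FILE *fp, size_t *count)
{
    assert(fp != NULL && count != NULL);

    if (fseek(fp, 0, SEEK_END) != 0)
        return NULL;
    long bytes = ftell(fp);             /* file size in bytes, not bits */
    rewind(fp);
    if (bytes <= 0 || (size_t)bytes % sizeof(float) != 0)
        return NULL;

    size_t n = (size_t)bytes / sizeof(float);
    float *buf = malloc(n * sizeof *buf);
    if (buf == NULL)
        return NULL;

    if (fread(buf, sizeof *buf, n, fp) != n) {  /* fewer floats than expected */
        free(buf);
        return NULL;
    }
    *count = n;
    return buf;
}
```

For the file in the question, with 4-byte floats, `*count` should come out as 536870912 / 4 = 134217728, which is 512^3.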
Sergey L.
  • @Barmar That's why I wrote "attempt". It will return 1024 bytes or leave whatever is in the buffer and act accordingly to possible EOF conditions. But it will guarantee that no partial block is read! – Sergey L. Jun 28 '13 at 00:15
  • I deleted my comment, I was thinking of `read()`, not `fread()`. – Barmar Jun 28 '13 at 00:16
  • Sergey L - I know how fread works, it's more the layout of the binary file that i'm struggling with. Not to mention the discrepancy between bits and bytes! Joking. I'm now aware that if it contains 4 bytes for each float data element, there should be 512^3 data elements in the file. Does that mean that my second fread argument should be 4? I tried using 4 and the code i'm running terminated with a segmentation fault. – stars83clouds Jun 28 '13 at 00:21
  • @stars83clouds if your second argument is 4 then you need to divide your file size by 4 for the third argument. And make sure you allocate enough memory to hold all that. `fread(buffer, 4, 134217728, in);`. Make sure you allocate bytes for the `buffer`. – Sergey L. Jun 28 '13 at 00:24
  • If I have a nested for-loop with the fread statement inside with 3 nested fors, then the 3rd argument to the fread can be 1? The data is being read into a 3d grid having 512 cells along each dimension. I toggle between 0 and 512 for each dimension whilst inside the nested for-loop I am reading in the data. – stars83clouds Jun 28 '13 at 00:28
  • @stars83clouds Then you are reading one block at a time. Yes, then the third argument should be 1. But make sure your read is successful - it can fail. – Sergey L. Jun 28 '13 at 00:30
  • That's my problem Sergey, i'm sure now of the size of each block, but the code fails with a segmentation fault when I allow the 2nd argument to be 4 and the 3rd to be 1. It makes sense logically but then that's programming, isn't it....:) – stars83clouds Jun 28 '13 at 00:34
  • @stars83clouds It would help if you'd actually post your code, because beyond this at least a dozen other things could go wrong. – Sergey L. Jun 28 '13 at 00:37
  • Sergey, please find my code attached to the question above. The bit of code to find the size of the file works, but the bit afterwards ends in a seg fault. – stars83clouds Jun 28 '13 at 00:47
  • @stars83clouds judging by your code above your segmentation fault is due to lack of memory allocation. – Sergey L. Jun 28 '13 at 11:46