My C code (using VS 2015) fails to read all of a file containing multiple 4-byte signed integers (int32), while a binary viewer program shows no issue with the data in the file (Image 1). I have tried several methods of reading the data files, with similar results. My question is simply: what is incorrect in the code examples below? If nothing is wrong with the code, what could be wrong with the data file?

I have provided a link to an example data file below if someone has the time and interest to examine it. In both code examples (below) the reading stops at integer number 78, which has the value 26 according to the binary viewer.

Example Code 1:

typedef signed __int32 INT32;

FILE *fp = NULL;
INT32 k;
int i=0;

fp = fopen(myfilePath, "r");

while(!feof(fp))
{
    fread(&k,sizeof(INT32),1,fp);
    printf("a[%d] = %d\n",i,k);
    i++;
}
fclose(fp);

Example Code 2:

typedef signed __int32 INT32;

FILE *fp = NULL;
long sz=0;
INT32 k;
int i=0;

fp = fopen(myfilePath, "r");
// find the size of the file
fseek(fp, 0L, SEEK_END);
sz = ftell(fp)/4;    // store the Int32 data count
rewind(fp);

for(i=0;i<sz;i++)
{
    fread(&k,sizeof(INT32),1,fp);
    printf("a[%d] = %d\n",i,k);
}
fclose(fp);

Image 1: The binary viewer reads the entire file correctly and indicates (in yellow) where the C code stops reading the file.

Link to example data file. Size: 3,572 bytes. Contains 893 Int32 values

Thank you for your assistance!

2 Answers

You are slightly off-target on what you have in your input file. Your input file provides the number of 32-bit integers contained in the file as the first value. You need only read the first integer to know how much storage you need to allocate for the remaining values.

Rather than using your own typedef for a signed 32-bit integer, you can use the stdint.h header from the standard C library, which provides all exact-width types, including the signed 32-bit type int32_t. The inttypes.h header provides the macros for printing and reading the exact-width types (e.g. PRId32 for printing, where the d can be replaced by u for unsigned, x for hexadecimal, o for octal, or i for integer, and SCNd32 for use with scanf).
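As a minimal sketch of how those macros are used with int32_t (parse_int32 and format_int32 are hypothetical helper names chosen for illustration, not part of any library):

```c
#include <stdio.h>
#include <inttypes.h>   /* PRId32, SCNd32; also pulls in <stdint.h> */

/* Parse a decimal int32_t from text; returns 1 on success, 0 otherwise. */
int parse_int32 (const char *s, int32_t *out)
{
    return sscanf (s, "%" SCNd32, out) == 1;
}

/* Format an int32_t into buf; returns the number of characters that
 * the formatted value needs (as snprintf does). */
int format_int32 (char *buf, size_t size, int32_t v)
{
    return snprintf (buf, size, "%" PRId32, v);
}
```

The macros expand to string literals, so they are concatenated with the surrounding format string at compile time.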

Therefore, all you really need to do is open the file for reading in binary mode, using "rb" as the mode. On POSIX systems the 'b' has no effect, but on Windows it matters: in text mode the runtime translates line endings and treats the byte 0x1A (Ctrl-Z, decimal 26) as end-of-file, which is why your read stops at the value 26. After opening the file, read the first 32-bit value into a variable. That tells you how many 32-bit values follow and leaves the file position indicator ready to read the remaining values, e.g.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int main (int argc, char **argv) {

    int32_t *a = NULL, nint = 0;  /* pointer and no. of int */
    /* use filename provided as 1st argument ("000002.dat" by default) */
    FILE *fp = fopen (argc > 1 ? argv[1] : "000002.dat", "rb");

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }
    /* read no. of int32_t values from 1st value in file */
    if (fread (&nint, 1, sizeof nint, fp) != sizeof nint) {
        perror ("fread-nint");
        return 1;
    }
    ...

Now simply allocate storage for the remaining values, either with malloc() (recommended) or with a variable length array (VLA). If you use a VLA, check the number of values first so you don't declare an array that exceeds your stack size, which is compiler and OS dependent. (Note that the MSVC compiler in VS 2015 does not support VLAs, so this option only applies to other compilers.) MS usually provides a 1M stack, so you should be able to declare a VLA of about 200K integers, balanced against whatever other stack use you have. A simple dynamic allocation with malloc eliminates the risk of a stack overflow...

    ...
    /* allocate/validate storage */
    if (!(a = malloc (nint * sizeof nint))) {
        perror ("malloc-a");
        return 1;
    }
    ...
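
Because nint comes straight from the file, it may be worth rejecting a negative or implausibly large count before allocating. A minimal sketch, assuming a hypothetical helper alloc_values and an arbitrary limit MAX_VALUES (neither is part of the code above):

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_VALUES 1000000   /* assumed upper bound; pick one that suits your data */

/* Return storage for nint int32_t values, or NULL if the count is
 * not positive, exceeds MAX_VALUES, or the allocation fails. */
int32_t *alloc_values (int32_t nint)
{
    if (nint <= 0 || nint > MAX_VALUES)
        return NULL;

    return malloc ((size_t)nint * sizeof (int32_t));
}
```

The caller is responsible for passing the returned pointer to free() once it is done with the values.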

All that remains is reading the rest of the values from your file with fread. The fread function reads a number of elements of a given size and stores them at the address provided. So you simply want to read nint values of sizeof nint (or you could use sizeof *a; both are the same type). The return value is the number of elements of that size successfully read from the file. You can read the remaining values with:

    ...
    /* read remaining values from file into a */
    if (fread (a, sizeof nint, (size_t)nint, fp) != (size_t)nint) {
        perror ("fread-a");
        return 1;
    }
    fclose (fp);    /* close file */
    ...

(note: always VALIDATE that your allocation succeeds, and validate your read from the file by checking the return of fread.)

A complete example that confirms the number of 32-bit values read from the file could be:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>   /* PRId32 */

int main (int argc, char **argv) {

    int32_t *a = NULL, nint = 0;  /* pointer and no. of int */
    /* use filename provided as 1st argument ("000002.dat" by default) */
    FILE *fp = fopen (argc > 1 ? argv[1] : "000002.dat", "rb");

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }
    /* read no. of int32_t values from 1st value in file */
    if (fread (&nint, 1, sizeof nint, fp) != sizeof nint) {
        perror ("fread-nint");
        return 1;
    }
    /* allocate/validate storage */
    if (!(a = malloc (nint * sizeof nint))) {
        perror ("malloc-a");
        return 1;
    }
    /* read remaining values from file into a */
    if (fread (a, sizeof nint, (size_t)nint, fp) != (size_t)nint) {
        perror ("fread-a");
        return 1;
    }
    fclose (fp);    /* close file */

    /* report number of integers read */
    printf ("%" PRId32 " int32_t read from file.\n", nint);

    free (a);       /* release the allocated storage */
}

Example Use/Output

Using the file you provided the link to, and passing the filename to read as the first argument to the program (or reading from the file in the current directory by default), you would get:

$ ./bin/freadint32_t ../dat/000002.dat
892 int32_t read from file.

Values In File

If you output the values in the file by adding a simple loop, you would find:

16  25  22  11  17  20  19  23  22  16
17  22  25  25  18  22  24  17  15  18
25  14  14  29  16  14  23  23  21  20
28  24  17  22  18  21  22  24  27  16
16  21  22  30  28  18  23  20  15  23
20  19  22  22  23  20  18  20  28  22
21  22  20  30  21  17  24  22  21  18
19  20  20  25  22  20  30  26  25  33
21  15  23  22  19  17  17  20  21  21
27  35  27  19  21  22  19  13  18  18
12  20  25  22  24  21  20  26  22  24
30  22  18  22  20  16  18  23  22  24
23  17  22  22  17  23  22  16  24  25
20  18  18  25  24  23  22  17  23  26
22  16  17  25  27  24  23  26  23  20
24  17  10  23  22  13  20  16  16  22
18  23  25  20  28  24  21  26  22  24
22  24  25  19  26  28  21  18  21  25
24  19  20  21  19  20  19  19  18  29
...
25  23  18  19  25  23  19  23  22  18
22  19  16  15  13  25  26  23  26  20
23  16  14  23  20  23  22  24  26  19
20  18

All 892 values read.
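
The loop that produces a ten-per-row listing like the one above could be sketched as follows. The helper name print_rows and the two-space separator are assumptions made for illustration:

```c
#include <stdio.h>
#include <inttypes.h>

/* Print n values to out, ten per row, and return the number of rows
 * written. */
size_t print_rows (FILE *out, const int32_t *a, size_t n)
{
    size_t rows = 0;

    for (size_t i = 0; i < n; i++) {
        /* end the row after every 10th value and after the final value */
        int last = ((i + 1) % 10 == 0) || (i + 1 == n);

        fprintf (out, "%" PRId32 "%s", a[i], last ? "\n" : "  ");
        if (last)
            rows++;
    }
    return rows;
}
```

In the complete example it would be called as print_rows (stdout, a, (size_t)nint); after the values have been read.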

Look things over and let me know if you have further questions.

David C. Rankin

fopen() http://www.cplusplus.com/reference/cstdio/fopen/

mode: C string containing a file access mode. It can be:

"r" read: Open file for input operations. The file must exist.
"w" write: Create an empty file for output operations. If a file with the same name already exists, its contents are discarded and the file is treated as a new empty file.
"a" append: Open file for output at the end of a file. Output operations always write data at the end of the file, expanding it. Repositioning operations (fseek, fsetpos, rewind) are ignored. The file is created if it does not exist.
"r+" read/update: Open a file for update (both for input and output). The file must exist.
"w+" write/update: Create an empty file and open it for update (both for input and output). If a file with the same name already exists its contents are discarded and the file is treated as a new empty file.
"a+" append/update: Open a file for update (both for input and output) with all output operations writing data at the end of the file. Repositioning operations (fseek, fsetpos, rewind) affects the next input operations, but output operations move the position back to the end of file. The file is created if it does not exist.

With the mode specifiers above the file is open as a text file. In order to open a file as a binary file, a "b" character has to be included in the mode string. This additional "b" character can either be appended at the end of the string (thus making the following compound modes: "rb", "wb", "ab", "r+b", "w+b", "a+b") or be inserted between the letter and the "+" sign for the mixed modes ("rb+", "wb+", "ab+").

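You are opening and reading your file in text mode. Change the line fp = fopen(myfilePath, "r"); to fp = fopen(myfilePath, "rb"); to open and read the file content in binary mode. In text mode on Windows, the byte 0x1A (decimal 26, Ctrl-Z) is treated as end-of-file, which matches the read stopping at the integer whose value is 26.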
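
As a minimal sketch of Example Code 1 with that change applied (read_all_int32 is a hypothetical helper name; the loop also tests the return of fread instead of feof, so no stale value is printed after the last read):

```c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

/* Print every int32_t in the file at path and return how many were
 * read, or -1 if the file could not be opened. */
long read_all_int32 (const char *path)
{
    FILE *fp = fopen (path, "rb");  /* binary mode: no 0x1A or CRLF handling */
    int32_t k;
    long n = 0;

    if (!fp) {
        perror ("fopen");
        return -1;
    }
    /* fread returns the number of complete items read, so the loop
     * stops cleanly at end-of-file */
    while (fread (&k, sizeof k, 1, fp) == 1) {
        printf ("a[%ld] = %" PRId32 "\n", n, k);
        n++;
    }
    fclose (fp);
    return n;
}
```

With this version a value of 26 in the data is read like any other integer.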

SoLaR