3

Let's say I have a file with a size of 5000 bytes, which I am trying to read from.

I have this code:

#include <stdio.h>
#include <stdlib.h>

int main()
{
  const char *file_path = "/path/to/my/file";
  FILE *fp = fopen(file_path, "rb"); //open the file
  if (fp == NULL)
    return 1;
  fseek(fp, 0, SEEK_END); // seek to end of file
  unsigned long fullsize = ftell(fp); //get the file size (5000 for this example)
  fseek(fp, 0, SEEK_SET); //bring the stream back to the beginning
  char *buf = (char*)malloc(5000);
  fread(buf,5000,1,fp);
  free(buf);
  fclose(fp);
  return 0;
}

I can also replace the fread call with

fread(buf,1000,5,fp);

Which is better, and why? I am asking mainly in terms of optimization; I understand that the return value is different.

raptor0102
  • 1
    These two pieces of code do different things. What are you trying to accomplish? – Nayuki Nov 07 '15 at 21:58
  • 1
    Do you mean the difference between `fread(buf,1000,5,fp)` and `fread(buf,5000,1,fp)`? – cadaniluk Nov 07 '15 at 21:58
  • @cad yes. I'm trying to work with the data in buf and manipulate it: some of it goes to memcpy, and some gets written to another file. – raptor0102 Nov 07 '15 at 22:00
  • 2
    @cad: that was a pretty high value edit :) +1 – Chris Beck Nov 07 '15 at 22:03
  • 1
    I've never understood that either; most developers can multiply two numbers together without the dubious support of passing two 'size' parameters. – Martin James Nov 07 '15 at 22:05
  • Same thing goes for `void* calloc(size_t nmemb, size_t size)`. – cadaniluk Nov 07 '15 at 22:06
  • @MartinJames And, in case they can't, resorting to the compiler also works. :O – cadaniluk Nov 07 '15 at 22:11
  • @cad: Except that `calloc` never allocates less than what you requested (it allocates all of it or fails). For both `calloc` and `fread`, the behavior isn't defined in terms of multiplying the two arguments. The difference matters when the product would wrap around. – Keith Thompson Nov 08 '15 at 00:27

2 Answers

6

If you exchange those two arguments, you still request the same number of bytes. However, the behaviour differs in other respects:

  • What happens if the file is shorter than that amount
  • The return value

Since you should always be checking the return value of fread, this is important :)

If you use the form `result = fread(buf, 1, 5000, fp);`, i.e. read 5000 units of size 1, but the file size is only 3000, then those 3000 bytes are placed in your buffer and 3000 is returned.

In other words you can detect a partial read and still use the partial result.

However, if you use `result = fread(buf, 5000, 1, fp);`, i.e. read 1 unit of size 5000, then the contents of the buffer are indeterminate (i.e. the same status as an uninitialized variable), and the return value is 0.

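In both cases, if the short read was caused by a read error, the file position is left in an indeterminate state, i.e. you will need to fseek before doing any further reads. (If the read simply reached the end of the file, the position has advanced by the number of bytes actually read.)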

The latter form (i.e. any size other than 1) is probably best used when you either want to abort if the full size is not available, or are reading a file with fixed-size records.

M.M
1

I've always found it best to use 1 for the element size. If fread() can't read a complete element at the end of the file, it will skip the last, partial element. This is not desirable when the last element is short. On the other hand, using 1 for element size does no harm.

Sample code that prints itself and demonstrates this behavior:

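#include <stdio.h>

#define SIZE 100   /* element size in bytes */
#define N 1        /* elements requested per call */

int main(void)
{
    FILE *fin;
    size_t ct;
    char buf[SIZE * N + 1];

    /* The program reads its own source, so save it as size_n.c. */
    fin = fopen("size_n.c", "r");
    if (fin == NULL)
        return 1;

    while (1) {
        /* ct counts complete elements only; a short final element is dropped. */
        ct = fread(buf, SIZE, N, fin);
        if (!ct)
            break;
        buf[ct * SIZE] = '\0';
        fputs(buf, stdout);
    }
    fclose(fin);
    return 0;
}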
Tom Zych
  • In this example – since you are reading `char`s – the element size actually *is* 1. But if you were reading, say, an array of `double`s, not reading the “last partial element” is probably appropriate. – 5gon12eder Nov 07 '15 at 22:29
  • @5gon12eder: Well, yes, I suppose you're right. I've always strenuously resisted reading machine-specific formats like that, though, due to portability. – Tom Zych Nov 07 '15 at 22:30
  • Yes, that's generally a good idea but I think the very reason the `fread` function accepts the size parameter is for reading in types other than `char`s. And doing this is not always bad. For example, if you have two processes running on the same machine that communicate via a pipe, using a binary format is perfectly fine. – 5gon12eder Nov 07 '15 at 22:34