
Just wondering what is the soundest way to allocate memory and fread() array data from a file in C.

First, an explanation:

int32_t longBuffer;

Now, when reading into longBuffer with fread(), the code could be written either way:

fread(&longBuffer, sizeof(longBuffer), 1, fd); //version 1
fread(&longBuffer, sizeof(int32_t), 1, fd); //version 2

Among the two, I would say that version 1 is more bug-safe, since if the type of longBuffer changes (let's say to int16_t), one does not have to worry about forgetting to update the fread()'s sizeof() with the new type.
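As a minimal sketch of version 1, assuming a hypothetical helper named read_one() and an already opened stream, the size can be taken from the destination object itself, so the type is named only in the declaration:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of the original code): reads a single
   value into *out and returns 1 on success, 0 on a short read or error.
   sizeof(*out) follows the declared type of out, so changing int32_t to
   int16_t in the parameter needs no further edit in the fread() call. */
static int read_one(FILE *fd, int32_t *out)
{
    return fread(out, sizeof(*out), 1, fd) == 1;
}
```

Checking the return value of fread() is a separate precaution, but it costs nothing to combine the two.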

Now, for an array of data, the code could be written as:

//listing 1
int8_t *charpBuffer=NULL; //line 1
charpBuffer = calloc(len, sizeof(int8_t)); //line 2
fread(charpBuffer, sizeof(int8_t), len, fd); //line 3

However, this exhibits the problem described in the first example: one has to remember to keep the sizeof(<type>) expressions in sync when changing the type of charpBuffer (say, from int8_t* to int16_t*).

So, one may attempt to write:

fread(charpBuffer, sizeof(charpBuffer[0]), len, fd); //line 3a

as a more bug-safe version. This should work since, after the allocation on line 2, writing charpBuffer[0] is perfectly valid.

Also, one could write:

fread(charpBuffer, sizeof(*charpBuffer), len, fd); //line 3b

However, trying to do the same for memory allocation, such as:

charpBuffer = calloc(len, sizeof(charpBuffer[0])); //line 2a

while better in syntax, exhibits undefined behaviour because, at this stage, writing charpBuffer[0] results in dereferencing a NULL pointer. Also, writing:

charpBuffer = calloc(len, sizeof(*charpBuffer)); //line 2b

exhibits the same problem.

So, now the questions:

  1. Are the lines of code "line 2b" and "line 3b" correct (ignore the undefined behaviour for this question), or are there some tricks that I am missing w.r.t. their "wiser" counterparts such as "line 2a/3a" and "line 2/3"?

  2. What would be the most bug-safe way to write the "listing 1" code, but avoiding any form of undefined behaviour?

EDITS (in order to clarify some aspects):

The discussion took a wrong direction. The question of compile time vs. run time is one thing (and I would like to have a standard guarantee for that one, too, but it is not the topic). The question of undefined behaviour for sizeof applied to a NULL dereference is another. Even if it is evaluated at compile time, I am not convinced that the standard guarantees this does not result in UB. Does the standard provide any guarantees?

Jonathan Leffler
user1284631

2 Answers


You seem to have a wrong idea about the sizeof operator. This operator is evaluated at compile-time, so the expressions that you pass to it have no chance of being evaluated while the program is running.

In the context of a sizeof operator, *charpBuffer and charpBuffer[0] are both safe, regardless of whether they are used before or after the corresponding memory is available. It is just a way to avoid typing the name of the type, and therefore means less duplication.
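A small sketch of what "not evaluated" means in practice; the two function names are hypothetical and exist only for illustration. The first takes the element size through a null pointer, the second shows that a side effect inside a sizeof operand never happens:

```c
#include <stddef.h>
#include <stdint.h>

/* Element size taken through a null pointer: the operand of sizeof is
   only inspected for its type (int8_t); nothing is dereferenced. */
static size_t elem_size_via_null(void)
{
    int8_t *charpBuffer = NULL;
    return sizeof(*charpBuffer);
}

/* A side effect inside a (non-VLA) sizeof operand does not take place.
   Returns the value of i after sizeof(i++), which is still 0.
   Some compilers warn about the unevaluated i++; that is the point. */
static int counter_after_sizeof(void)
{
    int i = 0;
    size_t unused = sizeof(i++);
    (void)unused;
    return i;
}
```

Calling elem_size_via_null() yields sizeof(int8_t), and counter_after_sizeof() yields 0, because only the type of each operand is used.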

EDIT

As commented below, there is a notable exception to the rule that sizeof is evaluated at compile-time (although it is not relevant to the code posted in the question). Since C99 allows variable-length arrays as automatic variables (and some C++ compilers accept them as an extension), applying sizeof to these may actually involve some runtime overhead.

Regarding your fears about undefined behaviour, I don't think there are grounds for them, since:

int vla[n]; // declare a variable-length array of length n

/* The compiler will produce code using the value of n prior to
   declaring the array to compute its size. */
x = sizeof(vla);

/* The space for the array is already available, and *vla has type int,
   which is not a variable-length array type, so the operand is not
   evaluated and nothing here is UB (n itself must be greater than 0).
   n is not involved and the result is a compile-time constant. */
y = sizeof(*vla);

z = sizeof(vla[0]); // same thing
Blagovest Buyukliev
  • sizeof isn't evaluated at compile-time in all scenarios: http://stackoverflow.com/a/2035292/541686 – user541686 Feb 28 '13 at 09:07
  • Are you sure about those? I know that GCC works (mostly) this way, however, I am not sure that the standard guarantees that behaviour. – user1284631 Feb 28 '13 at 09:11
  • @BlagovestBuyukliev: This example: http://stackoverflow.com/a/2709634/1284631 compiles under gcc on my machine and it works as expected (the sizeof operator prints the length of the array, which is known only at runtime). – user1284631 Feb 28 '13 at 09:17
  • @axeoth: Since VLA's are always automatic variables, their space is already reserved at the time of their declaration. So it is safe to use these expressions in conjunction with a `sizeof` operator and a VLA. – Blagovest Buyukliev Feb 28 '13 at 09:29
  • I understand your point of view. However, I want to know *how safe* it is. I know that it works in 99% of the cases. But I want to know if it works in 100%. And that guarantee is provided only by the standard. If it is there, it is OK for me. Compilers may adhere to the "as expected" practice, but as long as it is not in the standard, it is not good enough for me. I am chased by the standard enforcer. – user1284631 Feb 28 '13 at 09:33
  • @axeoth: I cannot quote the standard, but Wikipedia says essentially the same thing: http://en.wikipedia.org/wiki/Sizeof#Implementation: *In most cases, sizeof is a compile-time operator, which means that during compilation sizeof expressions get replaced by constant result-values. However, sizeof applied to a variable length array, introduced in C99, requires computation during program execution.* – Blagovest Buyukliev Feb 28 '13 at 09:38
  • The discussion took a wrong direction. The question of compile time vs. run time is one thing (and I would like to have a standard guarantee for that one, too, but it is not the topic). The question of undefined behavior for sizeof applied to a NULL dereference is another. Even if it is evaluated at compile time, I am not convinced that the standard guarantees this does not result in UB. – user1284631 Feb 28 '13 at 09:49

From C99 6.5.3.4, paragraph 2 (emphasis mine):

The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.

That the operand is "not evaluated" means that it's perfectly safe to write sizeof(charpBuffer[0]) or sizeof(*charpBuffer), even while charpBuffer is still NULL, because those expressions are only used for their types. Example 2 in the same section goes on to explicitly document the sizeof array / sizeof array[0] idiom for computing the number of elements in an array, without any mention or implication that it wouldn't be valid for empty arrays.
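Putting this together, here is one possible sketch of "listing 1" built from lines 2b and 3b. The wrapper function read_array() and its error handling are assumptions added for illustration; they are not part of the question's code:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocates len elements and fills them from fd.
   Returns the buffer (to be released with free()), or NULL if the
   allocation fails or fewer than len elements could be read.
   The element type appears only in declarations, never inside sizeof. */
static int8_t *read_array(FILE *fd, size_t len)
{
    /* line 2b: the operand *charpBuffer is not evaluated */
    int8_t *charpBuffer = calloc(len, sizeof *charpBuffer);
    if (charpBuffer == NULL)
        return NULL;

    /* line 3b: same idiom for the element size passed to fread() */
    if (fread(charpBuffer, sizeof *charpBuffer, len, fd) != len) {
        free(charpBuffer);
        return NULL;
    }
    return charpBuffer;
}
```

Changing the buffer to int16_t then only requires editing the declarations. Note that calloc() may legitimately return NULL when len is 0, so a caller of this sketch should treat a zero-length request separately.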

user4815162342
  • Thank you very much. I think this clears the issue for now, and gives an acceptable reference to the standard. I will accept this. – user1284631 Feb 28 '13 at 11:54
  • Remember that C does not acknowledge the existence of 'empty arrays'; arrays of zero length are not permitted by the standard. – Jonathan Leffler Oct 20 '14 at 17:08