When you write the `sizeof` expression first, you usually ensure that the calculation is done with at least `size_t` math. Let's look at what this means.
**Problem with placing `sizeof(X)` last:**
Imagine a scenario where `h` has the value 200000 and `w` has the value 50000 (perhaps obtained by accident). Assume that the maximum value an `int` can hold is 2147483647, which is common (the exact implementation-defined value can be read from the macro `INT_MAX` in the header `<limits.h>`). Both are legitimate values for an `int` to hold.

If you now use `malloc( h * w * sizeof(*p) );`, the subexpression `h * w` is calculated first, because the `*` operator groups left to right. You would get a signed integer overflow, as the result 10000000000 (10 billion) cannot be represented in an `int`.
The behavior of a program in which a signed integer overflow happens is undefined. The C standard even gives integer overflow as an example in its definition of undefined behavior:

> **3.4.3**
>
> **1** undefined behavior
>
> behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this document imposes no requirements
>
> **2** Note 1 to entry: Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
>
> **3** Note 2 to entry: J.2 gives an overview over properties of C programs that lead to undefined behavior.
>
> **4** EXAMPLE An example of undefined behavior is the behavior on integer overflow.

Source: C18, §3.4.3
**Placing `sizeof(X)` first instead:**
If you place the `sizeof` expression first instead, as in `malloc( sizeof(*p) * h * w );`, you usually don't run the risk of an integer overflow. This is for two reasons.
1. `sizeof` yields a value of the unsigned integer type `size_t`. On most modern implementations, `size_t` has a greater integer conversion rank and size than `int`; common values are `sizeof(size_t) == 8` and `sizeof(int) == 4`.

This matters for point 2, the usual arithmetic conversions that occur in arithmetic expressions.

2. In expressions, automatic type conversions of the operands often take place. These are called implicit conversions (and, for the narrower integer types, integer promotions). For more information, you can take a look at this useful FAQ.

For these conversions, the integer conversion rank of a type is decisive: the operand of the type with the lesser integer conversion rank is converted to the type of the operand with the greater rank.
Looking at the exact wording of the C standard:

> "Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank."

Source: C18, §6.3.1.8/1
Conversions of the signedness can also happen here, and that is what matters in this case, as described later on:

> "Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type."
>
> ....
>
> "Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type."

Source: C18, §6.3.1.8/1
Since `size_t` has an integer conversion rank greater than or equal to that of `int`, and `int` cannot represent all values of `size_t` (which is fulfilled here, since `int` usually has a smaller size than `size_t`, as said earlier), the operands `h` and `w` of type `int` are converted to `size_t` before the calculation.
**Importance of the signedness conversion to an unsigned integer:**

Now you might ask: why is the signedness conversion to an unsigned integer important?
There are two reasons here as well; the second is the more important one, but for the sake of completeness I want to cover both.

1. An unsigned integer type always has a wider positive range than a signed integer type of the same integer conversion rank. This is because a signed integer type also has to represent a negative range of values. An unsigned integer type has no negative range and can therefore represent almost twice as many positive values as the corresponding signed type.
2. But even more important: unsigned integer arithmetic can never overflow!

> "A computation involving unsigned operands *can never overflow*, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type."

Source: C18, §6.2.5/9 (emphasis mine)
That's the reason why placing the `sizeof` expression first, as in `malloc( sizeof(*p) * h * w );`, is safer.

However, if the product exceeds the limit of the unsigned type, the result wraps around, and the allocated memory would be too small for the desired purpose; accessing memory beyond the allocation would then also invoke undefined behavior.

Nonetheless, it protects you from undefined behavior in the multiplication at the call to `malloc()` itself.
**Side notes:**

Note that placing `sizeof` at the second position, `malloc( h * sizeof(*p) * w )`, would technically achieve the same effect, although it might decrease readability.

If the arithmetic expression in the call to `malloc()` has only one or two operands (e.g. `sizeof(x)` and an `int`), the order doesn't matter. But to stick to one convention, I would recommend always placing the `sizeof()` first: `malloc(sizeof(int) * 4)`. That way you don't risk accidentally forgetting it when there happen to be two `int` operands.
Using an unsigned integer type such as `size_t` for `h` and `w` can also be a smarter alternative. It ensures that no undefined overflow can happen in the first place, and besides that, it is more appropriate, since `h` and `w` aren't meant to hold negative values.