__safe_cmp macro in linux/minmax.h

Question

__safe_cmp is defined in minmax.h as follows:

#define __safe_cmp(x, y) \
    (__typecheck(x, y) && __no_side_effects(x, y))
#define __no_side_effects(x, y) \
    (__is_constexpr(x) && __is_constexpr(y))
#define __typecheck(x, y) \
    (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
#define __is_constexpr(x) \
    (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

From my understanding,

This macro performs its checks at compile time so that there are no type-incompatibility surprises at runtime
If compilation produces no warning, the result of this macro should always be 1 (true)

I also checked the discussion about __is_constexpr in Linux Kernel's __is_constexpr Macro

Some items are still not clear to me:

The value returned by __typecheck is wrapped in sizeof. Is the aim only to keep the result a constant expression? Does that mean the == operator does not yield a constant expression? C11 6.5.9 says the result type is int:

The result has type int. For any pair of operands, exactly one of the relations is true

Does (typeof(x) *)1 == (typeof(y) *)1 always return 1? 6.5.9 also says "both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function". Does this mean that if the pointer values are the same, the pointers compare equal even if the types differ? I tested this expression with (int, long), (int, struct), (int, void), (long, int*) and so on, and all returned 1 under Linux. Is there any case where this condition is false?

5 Otherwise, at least one operand is a pointer. If one operand is a pointer and the other is a null pointer constant, the null pointer constant is converted to the type of the pointer. If one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void, the former is converted to the type of the latter.

6 Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space

If the first two points are correct, is sizeof(equal expr) there to convert the result of "equal expr" into an integer constant expression, and why is that needed? I would have thought sizeof is there to make the result true (even when the comparison is false) so that the macro always returns true. I notice the comments say the constant expression is kept to avoid a VLA warning. Is there an example of what happens if the result is not a constant expression?
In __is_constexpr, consider "(long)(x) * 0l". When x is a constant expression, the cast yields a null pointer constant and the conditional expression has type int * (the type of the third operand), as explained in Linux Kernel's __is_constexpr Macro with reference to 6.5.15 paragraph 6. If the expression is something like (long)(x++) * 0l, the value should also be null, but is it not a null pointer constant? Would the result then be (void *)null, so that sizeof(*(void *)) == 1 (sizeof(void))? How can I verify this difference in code, since printing with %p gives the same output when sizeof is not applied?

#include <stdio.h>

#define check(x)      ((void *)((long)(x) * 0l))
#define _check(x)     (8 ? check(x) : (int *)8)
#define __check(x)    sizeof(*(_check(x)))

int main(int args, char** argv)
{
  int x = 0;

  printf("%p\n", check(x++));    // (nil)
  printf("%p\n", check(1));      // (nil)

  printf("%p\n", _check(x++));   // (nil)
  printf("%p\n", _check(1));     // (nil)

  printf("%zu\n", __check(x++)); // 1
  printf("%zu\n", __check(1));   // 4

  return 0;
}

Another question: why does C11 use the third operand's type as the result type when the second operand is a null pointer constant? NULL is defined as ((void *)0) in Linux and does not point to any real object, so why not use void * as the result type directly?
There is also a macro called typecheck defined in typecheck.h. What is the difference between the two macros? Its implementation also compares pointers to the types of its two arguments, generates a warning/error at compile time, and always evaluates to the constant 1.

#define typecheck(type,x) \
({    type __dummy; \
  typeof(x) __dummy2; \
  (void)(&__dummy == &__dummy2); /* always false? */ \
  1; \
})

BTW, is there any document or book besides the C11 standard that explains these kinds of questions?

I appreciate your help!

===============================================================

Update:

The question about __is_constexpr,

int main(int args, char** argv)
{

    int x = 1;
    int a = (long)x * 0l;
    int b = (long)1 * 0l;

    return 0;
}
0000000000001129 <main>:
    1129:   f3 0f 1e fa             endbr64 
    112d:   55                      push   %rbp
    112e:   48 89 e5                mov    %rsp,%rbp
    1131:   89 7d ec                mov    %edi,-0x14(%rbp)
    1134:   48 89 75 e0             mov    %rsi,-0x20(%rbp)
    1138:   c7 45 f4 01 00 00 00    movl   $0x1,-0xc(%rbp)
    113f:   c7 45 f8 00 00 00 00    movl   $0x0,-0x8(%rbp)
    1146:   c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
    114d:   b8 00 00 00 00          mov    $0x0,%eax
    1152:   5d                      pop    %rbp
    1153:   c3                      ret

The disassembly above shows that both expressions were folded to 0 at compile time, so they appear to be the same.

In gcc (version 11.3.0), the following code is found in c-typeck.c:

static bool
null_pointer_constant_p (const_tree expr)
{
  /* This should really operate on c_expr structures, but they aren't
     yet available everywhere required.  */
  tree type = TREE_TYPE (expr);
  return (TREE_CODE (expr) == INTEGER_CST
      && !TREE_OVERFLOW (expr)
      && integer_zerop (expr)
      && (INTEGRAL_TYPE_P (type)
          || (TREE_CODE (type) == POINTER_TYPE
          && VOID_TYPE_P (TREE_TYPE (type))
          && TYPE_QUALS (TREE_TYPE (type)) == TYPE_UNQUALIFIED)));
}

According to a debugging session with cc1:

int a = (long)x * 0l is treated as MULT_EXPR when checking TREE_CODE (expr), false
int b = (long)1 * 0l is treated as INTEGER_CST when checking TREE_CODE(expr), true

So the second expression is a null pointer constant at compile time, which seems to match the rules of the conditional operator.

I'm not familiar with gcc internals, so I'm not sure whether this analysis is right.

===============================================================

Update based on Nate Eldredge's answer. Many thanks to Nate Eldredge for patiently answering all the questions in detail.

The purpose of using sizeof is to keep the expression from being evaluated

I saw that the two expressions produced the same assembly when compiled by gcc on Linux, and assumed there would be no difference beyond that.

According to the answer, the result of evaluating the "==" expression is implementation-defined; the compiler may optimize it away but is not obliged to. The result with gcc on Linux may look fine yet differ with another compiler or platform.

The rule here: if the result is not deterministic and, more importantly, is not needed, keep the expression unevaluated, for example by wrapping it in sizeof. Line 3 of typecheck evaluates to false and may actually be executed at runtime in some cases.

For __is_constexpr

sizeof serves the same purpose as above and also distinguishes the pointee type: sizeof(*(void *)) versus sizeof(*(int *))

I thought ((long)(x) * 0l) would be folded by the compiler to 0l and then converted to (void *)0, so the expression would also be a null pointer constant. According to the gcc debugging, though, the expression is not a constant expression because x is not a constant, even though the result is always 0l. That is why the expression can be used to tell constant expressions from non-constant ones.

Why the type of the third operand is used instead of void * when the second operand is a null pointer constant

Per the answer, this also lets the compiler emit a warning when the types are incompatible

I need to think more about 6.5.15p8. The original wording is that if one of the operands is a null pointer constant, the result has the type of the other operand. Otherwise, if the operand is merely a null pointer, such as "(void*)((long)(x++) * 0l)", the result type seems to be void * directly. I had thought that a null pointer constant would be returned as-is (without taking the type of the other operand), and that otherwise (int*) would be returned directly

I also need to read 6.2.7 more carefully

Thanks again to Nate Eldredge for the patient and kind reply.

`((void *)((long)(x) * 0l))`, this multiplies x by 0 which is always going to be 0, then casts to a void pointer. Isn't that NULL. `(long)(x) * 0l` is always going to be a long 0 no? — xihtyM, Mar 14 '23 at 12:52
@xihtyM, I'm also confused. In the code, passing the non-constant expression x++ to the macro prints nil (%p) without sizeof and gives 1 when sizeof is applied (that should be sizeof(void)). So __is_constexpr correctly distinguishes constant from non-constant expressions. I would have thought the result of the _check macro is also NULL (the null pointer constant) even if x is a non-constant expression. Am I misunderstanding something here? — qingdaojunzuo, Mar 14 '23 at 13:06
`sizeof(void)` is undefined, although on gcc it says 1. Because void is an incomplete type. — xihtyM, Mar 14 '23 at 13:13
@xihtyM, yes. C11 says that if the second operand is a null pointer constant, the result type of the conditional expression is the type of the third operand. In the macro above, the third operand is (int *). When a constant expression (1) is passed, it seems to return sizeof(*(int *)) == 4; when x++ is passed, the result is 1, which seems to be sizeof(*(void *)) == 1 with gcc. I'm not sure my understanding is right here — qingdaojunzuo, Mar 14 '23 at 13:19
It's immaterial what `(typeof(x) *)1 == (typeof(y) *)1` yields; it's inside `sizeof` so it is not evaluated. The point is that if `x` and `y` are not of compatible types, then this code will cause gcc to emit a warning (`comparison of distinct pointer types lacks a cast`). But regardless, the type of `(typeof(x) *)1 == (typeof(y) *)1` is `int`, since that's what the `==` operator yields. So the value of the expression is `!!sizeof(int)` which is always 1. — Nate Eldredge, Mar 16 '23 at 18:34
So the `__typecheck` macro evaluates to 1 no matter what, it's just a question of whether it triggers a compiler warning or not. — Nate Eldredge, Mar 16 '23 at 18:35
@NateEldredge, thanks for the answer. Yes, I have read the related info and agree with you. My confusion is: 1. (typeof(x) *)1 == (typeof(y) *)1 should also warn "lacks a cast" without sizeof, like line 3 in the typecheck macro. 2. The results seem to be translated to immediate operands at compile time ($0x4 with sizeof and $0x1 without), so is there no difference at runtime? 3. Without sizeof the expression is only evaluated at compile time; if the result is always int 1 (or 0 in some cases?), why not use it directly? 4. In typecheck, I thought line 3 might return false, so the constant 1 is used to keep the result always true — qingdaojunzuo, Mar 17 '23 at 02:09
@NateEldredge, I think the only real difference here is the type of the result: as you mentioned, without sizeof it returns int and with sizeof it returns the constant 1. Is there any rule about this, anything I still misunderstand, or any side effect? Thanks — qingdaojunzuo, Mar 17 '23 at 02:11
I assume the purpose of using `sizeof` in `__typecheck` is to **ensure** that the `(typeof(x) *)1 == (typeof(y) *)1` expression is **not** evaluated. Without `sizeof`, the compiler might emit code to actually perform the comparison at runtime, which we do not want. (In practice it would presumably be optimized out, but we have no formal guarantee of that.) Then the `!!` just ensures that the expression's value is always specifically 1 (rather than 4 or some other number). — Nate Eldredge, Mar 17 '23 at 15:07
The alternate `typecheck` macro has this same problem: we have to rely on the compiler optimizing out the evaluation of the comparison expression, instead of having a language guarantee that it is not evaluated at all. It's probably fine in practice though. — Nate Eldredge, Mar 17 '23 at 15:12

score 2 · Accepted Answer · answered Mar 17 '23 at 18:06

There are an awful lot of questions in your post. I will try to address them as best I can. Standard references are to C17 N2176 as that's the version I have handy, but I think C11 should be the same in these respects.

First, for the __typecheck macro, the key point to remember is that, by definition in the language, the operand of sizeof is not evaluated (6.5.3.4p2). Therefore it is irrelevant what (typeof(x) *)1 == (typeof(y) *)1 would evaluate to, as it is not evaluated at all.

The only point of having this expression is that it is a constraint violation if they point to incompatible types (6.5.9p2) in which case the compiler must issue a diagnostic, and this applies whether the expression is evaluated or not. In other words, we don't care about the semantics of this expression, only its syntax. We don't want it actually executed, we only want it to be parsed.

This is the reason for using sizeof. If we left it out, then (int *)1 would actually be evaluated, and the result is implementation-defined, so we don't want to rely on it. Moreover, it doesn't point to any object and is not a null pointer, and the Standard (6.5.9p6) is somewhat unclear on what happens when you apply == to such a pointer. By putting it inside sizeof we don't have to find out.

So we don't care about the value of the expression (typeof(x) *)1 == (typeof(y) *)1; the behavior of the __typecheck macro only depends on its type, which is certainly int. Therefore, assuming that x and y do in fact have compatible types, then the value of __typecheck(x,y) is simply !!sizeof(int), which is certainly 1 because sizeof(int) cannot be zero. I assume the only reason for the !! is for predictability, to ensure that the expression always has the specific value 1 no matter what platform we are on.

As to your question about whether (typeof(x) *)1 == (typeof(y) *)1 is a constant expression (in the sense of 6.6): no, it is not. A pointer expression can only appear in a constant expression if it is an address constant, which must either be "a null pointer, a pointer to an lvalue designating an object of static storage duration, or a pointer to a function designator". The pointer (typeof(x) *)1 is not any of those. However, as noted, this question is irrelevant to understanding how __typecheck works.

The use of sizeof avoids the potential problems that your alternative typecheck has. Here (void)(&__dummy == &__dummy2); is actually evaluated. Now in practice the compiler optimizes it out, but it's not obliged to do so. A stupider compiler might actually allocate stack for the variables __dummy and __dummy2, execute instructions to compare their addresses, and then just ignore the result. This would be harmless but inefficient. By having it inside sizeof we avoid that possibility.

Notice that in your typecheck, (&__dummy == &__dummy2) will in fact have the value 0, assuming that x does in fact have type compatible with type (and if it does not then the program is ill-formed and all bets are off). The objects __dummy and __dummy2 are obviously not the same object, so pointers to them must compare unequal under 6.5.9p6. However, this is again irrelevant to the behavior of the macro, since the value of (&__dummy == &__dummy2) is not used in determining the value of typecheck(type, x). Under gcc's rules for its statement expressions extension (not part of standard C!), the value of the ({ ... }) expression is the value of the expression 1; at the end, which is simply the constant 1.

For __is_constexpr, we are taking advantage of (6.5.15p6):

If both the second and third operands are pointers or one is a null pointer constant and the other is a pointer, the result type is a pointer to a type qualified with all the type qualifiers of the types referenced by both operands. Furthermore, if both operands are pointers to compatible types or to differently qualified versions of compatible types, the result type is a pointer to an appropriately qualified version of the composite type; (*) if one operand is a null pointer constant, the result has the type of the other operand; (**) otherwise, one operand is a pointer to void or a qualified version of void, in which case the result type is a pointer to an appropriately qualified version of void.

If x is an integer constant expression, then by 6.6p6, (long)x * 0l is also an integer constant expression, and its value is zero. It is therefore a null pointer constant under 6.3.2.3p3, and thus so is ((void *)((long)(x) * 0l)). So by the clause I marked (*), 8 ? ((void *)((long)(x) * 0l)) : (int *)8 has type int *. Thus sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)) has the value sizeof(int), and so __is_constexpr(x) evaluates to 1.

If x is not an integer constant expression, then ((void *)((long)(x) * 0l)) is not a null pointer constant. So the second operand has type (void *), the third has type (int *) and neither one is a null pointer constant. Thus under clause (**), the result has type void *. Therefore __is_constexpr(x) reduces to sizeof(int) == sizeof(void). Now sizeof(void) is not defined in standard C, but gcc, as an extension, defines it to be 1. So provided that sizeof(int) != 1, which is presumably true on every system supported by Linux, the result of __is_constexpr(x) is 0.

As per your tests, when you do check(1), the result is a null pointer constant as explained above. When you do check(x++), you get the result of converting the non-constant value 0 to (void *). This is not a null pointer constant, so 6.3.2.3p3 does not apply, and we are left with 6.3.2.3p5 which says the result is implementation-defined. gcc defines the behavior in this case as the pointer with all bits zero, which happens to be a null pointer on this platform. So check(1) and check(x++) do both turn out to be null pointers, on this implementation, but for very different reasons.

Again, in the __is_constexpr macro, we care about syntax instead of semantics. We don't care about the runtime value of the expression ((void *)((long)(x) * 0l)), and it's not going to be evaluated anyway because it's inside sizeof. We care about whether syntactically it is a null pointer constant or not.

One might ask why ?: has the behavior defined in 6.5.15p6. You suggested that since NULL is defined in Linux as (void *)0, then it would make sense for flag ? p : q to have the type void *. I don't think this would actually be as useful in practice. Consider for instance the following buggy code:

int i;
double *pd;
pd = flag ? NULL : &i;

Under the actual C rules, flag ? NULL : &i has type int *. So we are assigning an int * to a double * and get a diagnostic because they are incompatible. That's good, the compiler found our bug.

Under your rule, flag ? NULL : &i has type void * and the code is well-formed, so it compiles without error. When flag is true, we get the effect of pd = (void *)0, so that pd becomes a null pointer: well and good. But when flag is false, we get the effect of pd = (void *)&i, and we likely get a crash or other UB if we later dereference pd.

Thanks a lot for patiently answering so many questions in detail :). It is much clearer now. Sometimes it's tricky for a newcomer to C and the kernel to understand the logic and purpose of a complex expression. IMO, the way to analyze it is in terms of correctness and performance at compile time and at runtime. In this case, when I saw two expressions producing the same output after compilation, I assumed the behavior would always be determined at compile time and harmless at runtime. As you mentioned, I should not rely on that. The rule is: if you can't be sure of the result and there is no need to run the code, avoid running it. Thanks again — qingdaojunzuo, Mar 18 '23 at 03:24
