Microsoft extensions to C and C++:

To perform the same cast and also maintain ANSI compatibility, you can cast the function pointer to a uintptr_t before you cast it to a data pointer:

int ( * pfunc ) ();
int *pdata;
pdata = ( int * ) (uintptr_t) pfunc;

Rationale for C, Revision 5.10, April-2003:

Even with an explicit cast, it is invalid to convert a function pointer to an object pointer or a pointer to void, or vice versa.

C11:

7.20.1.4 Integer types capable of holding object pointers

Does it mean that pdata = ( int * ) (uintptr_t) pfunc; is invalid?

As Steve Summit says:

The C standard is written to assume that pointers to different object types, and especially pointers to function as opposed to object types, might have different representations.

While pdata = ( int * ) pfunc; leads to undefined behavior (UB), it seems that pdata = ( int * ) (uintptr_t) pfunc; leads to implementation-defined behavior (IB). This is because "Any pointer type may be converted to an integer type" and "An integer may be converted to any pointer type", and uintptr_t is an integer type.

TylerH
pmor
  • Microsoft documentation is not helpful regarding the C standard. It does not even refer to the standard correctly; there is no ANSI C standard now. – Eric Postpischil Feb 07 '22 at 11:56
  • You have been working with the C standard long enough now to know that “invalid” is not applicable here. The C standard does not make many things invalid. It leaves many things undefined, but that does not mean programs may not use them. C implementations may extend the C standard by defining additional behaviors. – Eric Postpischil Feb 07 '22 at 11:59
  • Also note, in C11 "Common Extensions": **J.5.7**: *1 A pointer to an object or to void may be cast to a pointer to a function, allowing data to be invoked as a function (6.5.4)* and *2 A pointer to a function may be cast to a pointer to an object or to void, allowing a function to be inspected or modified (for example, by a debugger) (6.5.4)*. – Adrian Mole Feb 07 '22 at 11:59
  • It's not invalid, it is undefined. It means that the C Standard does not guarantee any behavior in this case. But from your link it seems that the Microsoft compiler guarantees the expected behavior. – mch Feb 07 '22 at 12:01
  • If you use MS compiler you probably do not care about 100% portability. It is defined in MS but not defined in standard portable C. – 0___________ Feb 07 '22 at 12:45
  • The mystery here is why Microsoft is recommending the detour via `uintptr_t`. If function and data pointers are interconvertible on a given platform, they can be interconverted directly, perhaps with explicit casts to silence warnings. But if they're not interconvertible, converting to `uintptr_t` first isn't going to help a bit. – Steve Summit Feb 07 '22 at 12:52
  • pmor, Why do you want to save a function pointer in a `int *`? What is the use case? – chux - Reinstate Monica Feb 07 '22 at 13:07
  • Microsoft is recommending the detour via `uintptr_t` to be sure that the unsigned integer used in **64-bit compilations** is an `unsigned long long int` (64 bits!) instead of a 32-bit integer (standard `int` and `unsigned int` for MSVC). This double casting will trigger warnings in case of malformed assignments. This is a common arrangement MS uses in its headers. – Frankie_C Feb 07 '22 at 14:29
  • @Frankie_C Are you saying that `pdata = (int *)(uintptr_t)pfunc;` is safer than `pdata = (int *)(unsigned int)pfunc;`? But why use either? Why not use `pdata = (int *)pfunc;`, and eliminate the unsigned ints altogether? – Steve Summit Feb 07 '22 at 15:09
  • @EricPostpischil Any idea, why Rationale uses “invalid” instead of "undefined"? Terms such as "undefined", "indeterminate", "unspecified", "unknown" may be non-trivial to clearly understand and distinguish. Example: C2x: "the second declares y to be an array of int of unspecified size", while it should be (I think) "... unknown size". – pmor Feb 07 '22 at 21:36
  • @chux-ReinstateMonica I don't want to save a function pointer in a `int *`. I don't have a use case. I was just reading "Microsoft extensions to C and C++" and was surprised that "To perform the same cast and also maintain ANSI compatibility, you can cast the function pointer to a `uintptr_t` before you cast it to a data pointer" while the Rationale says that "it is invalid to convert a function pointer to an object pointer or a pointer to void, or vice versa" (actually "undefined" rather than "invalid"). – pmor Feb 07 '22 at 21:39
  • @chux-ReinstateMonica Extra: as for `dlsym`: I think that if a system does support the `dlsym`, then it was tested. Hence, the `dlsym` _can_ be used. – pmor Feb 07 '22 at 21:45
  • @EricPostpischil Another non-trivial thing: if the value of `x` is unspecified, then the expression `x != x` evaluates to an unspecified value, too. Details: [DR #260](http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm), [DR #451](http://www.open-std.org/Jtc1/sc22/WG14/www/docs/dr_451.htm). – pmor Feb 07 '22 at 21:54
  • @SteveSummit It isn't safer; it adds some checks that trigger warnings or errors where needed. If the system you're compiling on cannot support conversion from function pointer to object pointer, it will generate an error. – Frankie_C Feb 08 '22 at 14:32
  • @EricPostpischil See UPD2. How can you comment on that? – pmor Feb 15 '22 at 21:04
  • [Equivalent of uintptr_t/intptr_t for pointers to functions?](https://stackoverflow.com/q/36403711/995714), [Can you cast a "pointer to a function pointer" to void*](https://stackoverflow.com/q/53033813/995714), [Why can't I cast a function pointer to (void *)?](https://stackoverflow.com/q/36645660/995714) – phuclv Feb 22 '22 at 04:45

4 Answers

Given the definitions

int (*pfunc)();
int *pdata;

, the assignments

pdata = (int *)pfunc;
pdata = (int *)(uintptr_t)pfunc;

are, IMO, equivalent. On a platform where data pointers are of the same size as, or larger than, function pointers, both assignments will work as desired. But on a platform where data pointers are smaller than function pointers, both assignments will inevitably scrape off some of the bits of the function pointer, resulting in a data pointer which cannot be converted back to the original function pointer later.

In particular, I believe that both assignments are equivalent despite the presence of the (uintptr_t) cast in the second one. I believe that cast accomplishes precisely nothing.

On a platform where data pointers are smaller than function pointers, and where type uintptr_t is of the same size as data pointers, in the assignment

pdata = (int *)(uintptr_t)pfunc;

, the cast to (uintptr_t) will scrape off some of the bits of pfunc's value.

On a platform where data pointers are smaller than function pointers, and where type uintptr_t is of the same size as function pointers, in the assignment

pdata = (int *)(uintptr_t)pfunc;

, the cast to (int *) will scrape off some of the bits of pfunc's value.

In both cases pdata will end up with only some fraction of pfunc's original value.

(Here I disregard the possibility of architectures with padding bits or the like. On some bizarre, hypothetical platform where function pointers are larger than data pointers, but the extra bits are always 0, both assignments would again work.)

(I've also disregarded the possibility that int * is a different size than void *. I'm not sure whether that would affect the answer, whether a "detour" via void * is more or less un- or necessary when attempting a conversion from int (*)() to int *.)
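
The size argument above can be checked on a given implementation. The following is a minimal sketch, not part of the original answer: the names `int_fn`, `sample_fn`, `fn_fits_in_data_ptr`, `fn_fits_in_uintptr`, `roundtrip_preserves` and `report_pointer_sizes` are made up for illustration, and the round trip through `uintptr_t` relies on implementation-defined conversions. Equal sizes are only a necessary condition for a lossless conversion, so the final check expresses an expectation about mainstream platforms, not a guarantee of the standard.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef int (*int_fn)(void);               /* hypothetical function pointer type */

static int sample_fn(void) { return 7; }   /* hypothetical test function */

/* Nonzero if a function pointer is no wider than an int *. */
static int fn_fits_in_data_ptr(void) {
    return sizeof(int_fn) <= sizeof(int *);
}

/* Nonzero if a function pointer is no wider than uintptr_t. */
static int fn_fits_in_uintptr(void) {
    return sizeof(int_fn) <= sizeof(uintptr_t);
}

/* Nonzero if f compares equal to itself after the conversions
   function pointer -> uintptr_t -> function pointer.
   Both conversions are implementation-defined. */
static int roundtrip_preserves(int_fn f) {
    uintptr_t bits = (uintptr_t)f;
    return (int_fn)bits == f;
}

/* Print the sizes, then check the expectation for mainstream platforms:
   if uintptr_t is at least as wide as a function pointer, the value
   survives the round trip. The round trip is skipped otherwise. */
static void report_pointer_sizes(void) {
    printf("function pointer: %zu bytes\n", sizeof(int_fn));
    printf("int *:            %zu bytes\n", sizeof(int *));
    printf("uintptr_t:        %zu bytes\n", sizeof(uintptr_t));
    printf("fits in int *:    %d\n", fn_fits_in_data_ptr());
    assert(!fn_fits_in_uintptr() || roundtrip_preserves(sample_fn));
}
```

Calling `report_pointer_sizes()` on a platform where either size check fails would show which of the two casts in `(int *)(uintptr_t)pfunc` does the truncating.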

Steve Summit

Casting to uintptr_t only works if this type is defined, which may not be the case on legacy systems using ancient compilers. Note, however, that uintptr_t must be large enough for any object pointer, especially char * or void *, but may be smaller than function pointers. Such architectures are rare today and Microsoft compilers probably no longer support them, but they were commonplace in the 16-bit world (MS-DOS, Windows 1, 2 and 3.x), where the medium model had 32-bit segmented code pointers and 16-bit data pointers.

Note also that the C Standard allows for int * and void * to have a different size and representation albeit no Microsoft compiler supports such exotic targets.

On current systems with modern compilers, data pointers and code pointers almost always have the same size and representation. This is actually a requirement for POSIX compatibility, so the recommendation to use an intermediary cast to (uintptr_t) is valid and effective.

For complete portability, if the goal is to pass a function pointer via an opaque void *, you can always allocate an object of the proper function pointer type, initialize it with pfunc and pass its address:

// setting up the void *
int (*pfunc)();
void *pdata = malloc(sizeof pfunc);
memcpy(pdata, &pfunc, sizeof pfunc);

// using the void *
int (**ppfunc)() = pdata;
(*ppfunc)();     // equivalent to (**ppfunc)();
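
The fragment above can be wrapped into a pair of helpers. This is a sketch under assumptions, not the answer's own code: `int_fn`, `answer`, `box_fn` and `call_boxed` are hypothetical names, and the callback is given a `(void)` prototype for clarity. Only object pointers are converted to and from `void *` here, so no function-to-object pointer conversion takes place.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int (*int_fn)(void);              /* hypothetical function pointer type */

static int answer(void) { return 42; }    /* hypothetical callback */

/* Copy the function pointer into heap storage and hand out its address.
   Returns NULL if the allocation fails; the caller frees the result. */
static void *box_fn(int_fn f) {
    void *boxed = malloc(sizeof f);
    if (boxed != NULL)
        memcpy(boxed, &f, sizeof f);
    return boxed;
}

/* Copy the function pointer back out of the box and call it. */
static int call_boxed(const void *boxed) {
    int_fn f;
    assert(boxed != NULL);
    memcpy(&f, boxed, sizeof f);
    return f();
}
```

A caller would write `void *p = box_fn(answer);`, pass `p` through the opaque interface, call `call_boxed(p)` on the other side, and eventually `free(p)`.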
chqrlie
  • "Casting to unsigned long long seems a safer choice to silence the warning than using uintptr_t, an optional type that may be smaller than the function pointer." In what case is uintptr_t going to be smaller than a function pointer? The whole point of Microsoft's `*ptr_t` types is that they are the same size as a pointer for the process. On 32-bits they are 32-bit wide, and on 64-bits they are 64-bit wide. It's a cross platform way in Visual C++ of describing an unsigned int that is the same width as a pointer. – Joseph Willcoxson Feb 07 '22 at 16:45
  • @JosephWillcoxson: If both function pointers and data pointers have the same size, why do they complain about an explicit cast from one to the other? Previous Microsoft systems did have memory models with different sizes for code and data: 16-bit compact model had 16-bit function pointers and 32-bit data pointers, whereas medium model had 32-bit function pointers and 16-bit data pointers. `uintptr_t` would be `unsigned int` on this model, a 16-bit type. – chqrlie Feb 07 '22 at 17:07
  • Re: "to silence the warning": which warning? I see no warnings under `/std:c11 /Za` except `warning C4700: uninitialized local variable 'pfunc' used`. – pmor Feb 07 '22 at 22:02
  • Re: `memcpy(pdata, pfunc, sizeof pfunc);`: the `memcpy` has `const void * restrict s2`, so you'd need `(void*)pfunc`, which is undefined. – pmor Feb 07 '22 at 22:12
  • @chux-ReinstateMonica: you are correct, `&pfunc` must be passed to `memcpy`. – chqrlie Feb 08 '22 at 08:55
  • This question is about C11, not "ancient compilers". – einpoklum Feb 08 '22 at 08:56
  • @pmor: sorry, I misread the question: the C Standard makes it **invalid** to use a direct cast which should require a compiler error. I amended the answer. – chqrlie Feb 08 '22 at 08:57
  • @einpoklum: I just wanted to introduce some background for the C Standard rationale. My answer boils down to: `int *pdata = (int *) (uintptr_t) pfunc;` should be fine for current systems and I propose a fully portable alternative. – chqrlie Feb 08 '22 at 09:09
  • Does `int (**ppfunc)() = pdata;` lead to UB? Pointers (to pointers) to function as opposed to object types, might have different representations. Also see UPD2. – pmor Feb 15 '22 at 21:11
  • @pmor: a pointer to function is an object type. `ppfunc` is a pointer to an object type, `pdata` is a `void*` returned by `malloc()`, the memory returned by malloc is guaranteed to be properly aligned for any object type, so the cast should be OK. – chqrlie Feb 15 '22 at 21:57
  • About `&pfunc`. Consider an implementation with 48 bit function pointer and 32 bit object pointer. Consider that `pfunc` has 36 bits set. Consider that `&pfunc` has 36 bits set. What will happen if `&pfunc` is implicitly converted to `void*` (32 bit)? – pmor Feb 15 '22 at 22:18
  • @pmor: in your example, `&pfunc` is 32-bit object pointer pointing to a 48-bit object. `pfunc` lives in the data space, its address has 32-bits. – chqrlie Feb 15 '22 at 23:03
  • And if 48-bit object is located "out of 32-bit range"? – pmor Feb 16 '22 at 00:27
  • @pmor: if the pointer is located outside the data space, it is not an object. Some compilers have extended qualifiers such as `near`, `far` or `huge` to qualify objects in peculiar data spaces. This used to be commonplace on desktop systems in the 80s and 90s. The question here pertains to more regular architectures. – chqrlie Feb 16 '22 at 07:27

Is conversion of a function pointer to a uintptr_t / intptr_t invalid?

No. It may be valid. It may be undefined behavior.


Conversion of a function pointer to int * is not defined. Nor to any object pointer. Nor to void *.
pdata = ( int * ) pfunc; is undefined behavior.

Conversion of a function pointer to an integer type is allowed, with restrictions:

Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type. C17dr 6.3.2.3 6

Also integer to a pointer type is allowed.

An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation. C17dr 6.3.2.3 5

void * to integer to void * is defined. Object pointer to/from void* is defined. Then the optional (u)intptr_t types are sufficient for round-trip success. Yet we are concerned about a function pointer. Often enough function pointers are wider than an int *.

Thus converting a function pointer to int * only makes sense through an integer type, the wider the better.

VS recommends going through the optional type uintptr_t, which is likely sufficient on platforms where that conversion loses no information. Yet uintmax_t may afford less loss of information, especially in the function pointer to integer step, so I pedantically suggest:

pdata = ( int * ) (uintmax_t) pfunc;

Regardless of the steps taken, code is likely to become implementation specific and deserves guards.

#if defined(this) && defined(that)
  pdata = ( int * ) (uintmax_t) pfunc;
#else
  #error TBD code
#endif
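
One possible way to make such a guard concrete is sketched below. It is an illustration under assumptions, not the answer's code: it goes through `uintptr_t` instead of `uintmax_t` because the availability of `uintptr_t` can be tested via `UINTPTR_MAX`, and the names `int_fn`, `sample`, `fn_to_int_ptr` and `int_ptr_to_fn` are hypothetical. The size check is a necessary condition only, and the conversions remain implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

typedef int (*int_fn)(void);             /* hypothetical function pointer type */

static int sample(void) { return 1; }    /* hypothetical function to convert */

#if defined(UINTPTR_MAX)   /* <stdint.h> defines this only if uintptr_t exists */
/* Refuse to compile if the integer step would be narrower than the pointer. */
static_assert(sizeof(int_fn) <= sizeof(uintptr_t),
              "function pointers are wider than uintptr_t");

/* Implementation-defined conversions; the int * must not be dereferenced. */
static int *fn_to_int_ptr(int_fn f) { return (int *)(uintptr_t)f; }
static int_fn int_ptr_to_fn(int *p) { return (int_fn)(uintptr_t)p; }
#else
#error "uintptr_t is not available; conversion strategy still to be decided"
#endif
```

With these helpers, `int_ptr_to_fn(fn_to_int_ptr(sample))` is expected to compare equal to `sample` on platforms where code and data pointers share a representation.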
chux - Reinstate Monica

Migrating the solution from the question to an answer:

Here is the answer from Microsoft:

Q: How exactly "cast the function pointer to a uintptr_t before you cast it to a data pointer" leads to maintaining "ANSI compatibility"?

A: Without the cast to uintptr_t it’s possible that the code will fail to compile with other compilers, even if they use the same pointer model. For example: https://gcc.godbolt.org/z/9EjTe1s4x - if you add the uintptr_t it compiles without warnings/errors.

TylerH