Note: This question attempts to improve what I attempted to ask here, but fell short.
Also, I have seen this, and this. They discuss similar concepts, but do not answer these questions.

My environment is Windows 10, and for testing I used two compilers, CLANG and GCC.

I am passing variables via a void * function argument, and need to convert them. I would like to get some feedback on inconsistencies I am seeing between methods for different types.

The following is a stripped-down version of a test function that accepts multiple input types through a void * parameter, plus an enumerated value parameter that indicates the type being passed in:

void func(void *a, int type)
{
    switch(type) {
        case CHAR://char
            char cVar1    = (char)a;      //compiles with no warnings/errors, seems to work
            char cVar2    = *(char *)a;   //compiles with no warnings/errors, seems to work
            break;
        case INT://int
            int iVar1     = (int)a;       //compiles with no warnings/errors, seems to work
            int iVar2     = *(int *)a;    //compiles with no warnings/errors, seems to work
            break;
        case FLT://float
            float fVar1   = (float)a;     //compile error: (a1)(b1)
            float fVar2   = *(float *)a;  //requires this method
            break;
        case DBL://double
            double dVar1  = (double)a;    //compile error: (a1)(b1)(b2)
            double dVar2  = *(double *)a; //this appears to be correct approach
            break;
    }
}

Calling method:

int main(void)
{

    char   c = 'P';
    int    d = 1024;
    float  e = 14.5;
    double f = 0.0000012341;
    double g = 0.0001234567;

    void *pG = &g;

    func(&c, CHAR);//CHAR defined in enumeration, typical
    func(&d, INT);
    func(&e, FLT);
    func(&f, DBL);
    func(pG, DBL);

    return 0;
}

Exact error text relating to flags in comments above follows:

CLANG - version 3.3

  • (a1) - ...error: pointer cannot be cast to type 'float'

gcc - (tdm-1) 5.1.0

  • (b1) - ...error: pointer value used where a floating point value was expected
  • (b2) - ...error: pointer cannot be cast to type 'double'

For reference in discussion below

  • method 1 == type var = (type)val;
  • method 2 == type var = *(type *)val;

My results indicate that converting float & double requires method 2.
But for char & int, method 2 seems to be optional: method 1 compiles fine and seems to work consistently.

Questions:

  • It would seem that recovering a value from a void * function argument should always require method 2, so why does method 1 (seem to) work with char and int types? Is this undefined behavior?

  • If method 1 works for char and int, why does it not also work with at least the float type? It's not because their sizes are different, i.e.: sizeof(float) == sizeof(int) == sizeof(int *) == sizeof(float *). Is it because of a strict aliasing violation?

ryyker
  • We can't comment on strict aliasing without seeing the calling code and the real variable declarations. Strict aliasing has little to do with pointer conversion and everything to do with lvalue access. – Lundin Jan 15 '20 at 15:47
  • @Lundin - I will edit to add... – ryyker Jan 15 '20 at 15:48
  • You have not shown the calls where the arguments are passed, so we do not know how you are converting them to `void *`. In any case, converting between `void *` and `char` or `float` is most likely not what you want. Generally, you would pass the **address** of some object to the routine. So the routine receives a pointer, which has been converted to `void *`. Then the routine should convert that pointer to the right type of **pointer**, so that you again have the address of the object. Then that pointer is dereferenced with `*` to get the value of the object. – Eric Postpischil Jan 15 '20 at 15:48
  • Anyway, the compiler errors & the lack of them can simply be explained by C allowing explicit conversions between integers and pointers, and it allows implicit conversions between void pointers and other object pointers. What happens when you de-reference the result is another story - could be UB because of misalignment, pointer size mismatch, strict aliasing, trap representations and so on. – Lundin Jan 15 '20 at 15:52
  • The different methods do completely different things. – molbdnilo Jan 15 '20 at 15:52
  • @molbdnilo - explain please. _different_ is apparently being accepted as valid by both of my compilers, but the reason I posted this is to determine what/why _different_ is. Why does method 1 work at all? – ryyker Jan 15 '20 at 15:54
  • @Lundin - edit with calling method – ryyker Jan 15 '20 at 15:55
  • @EricPostpischil - calling method has been added to post. – ryyker Jan 15 '20 at 15:56
  • One method converts an address to a number, the other converts whatever is stored at that address to a number. In your example, it is highly unlikely that `(char) a` is `'P'` or `(int) a` is 1024. – molbdnilo Jan 15 '20 at 16:00
  • @molbdnilo - The conversions for `int` and `char` are working and even with `-Wall`, I am getting no compiler indication of a problem. What you are saying though makes perfect sense. I just do not understand _why_ it is working. (suggests UB, but I am not sure.) – ryyker Jan 15 '20 at 16:05
  • in `switch(type) ... ` you should case using `CHAR`, etc, not the explicit value, 0. – alinsoar Jan 15 '20 at 16:17
  • @alinsoar - Edited. Thanks. In the original post I had only included that function, and not the calling method, so I explicitly used the digits. In my actual code I do use the enumerated values btw :) – ryyker Jan 15 '20 at 16:28
  • Ok, anyway I think it is quite clear without much ado that in C you cannot convert pointers to other types apart from integer, simply because the representation would be useless and would have really no meaning. On the other hand, even the representation of a pointer as integer on some architectures is counter-intuitive. – alinsoar Jan 15 '20 at 16:31
  • @alinsoar: Another issue is that when converting a pointer to an integer type of the same size, one could either view the operation as converting the pointer into a particular integer type with the same size and representation, then converting that into the target type, or as converting the pointer directly into an integer of the target type with the same representation, and both operations would be equivalent. If one were to convert a pointer to `float`, however, the operations would have different meanings and it would be ambiguous which was intended. – supercat Jan 15 '20 at 17:44

3 Answers

The C standard explicitly allows conversions between pointers and integer types. This is spelled out in section 6.3.2.3 regarding pointer conversions:

5 An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

6 Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

Assuming you cast an integer value to void * when passing it to the function and then cast it back to the same integer type, this can be done provided the implementation allows it. GCC in particular will allow this assuming the integer type in question is at least as big as a void *.

This is why the conversion can work for the char and int cases. However, you would need to pass in the values themselves (cast to void *) instead of their addresses.

So for example if you called the function like this:

func4((void *)123, INT);

Then the function can do this:

int val = (int)a;

And val would contain the value 123. But if you called it like this:

int x = 123;
func4(&x, INT);

Then val in the function would contain the address of x in main converted to an integer value.
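The two cases can be put side by side in a minimal sketch. The helper names pack_int, unpack_int and load_int are hypothetical and not part of the question's code, and the round trip through intptr_t is implementation-defined (it behaves as shown on mainstream compilers such as GCC and Clang, but the standard does not promise it):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: smuggle a small integer VALUE inside a void *.
   The result is not a usable address and must never be dereferenced. */
static void *pack_int(int value)
{
    return (void *)(intptr_t)value;
}

/* Counterpart of pack_int: the cast going out matches the cast going in. */
static int unpack_int(void *a)
{
    return (int)(intptr_t)a;
}

/* The pointer really is the address of an int, so dereference it. */
static int load_int(void *a)
{
    assert(a != NULL);
    return *(int *)a;
}
```

Calling unpack_int on a pointer produced by pack_int returns the packed value, while load_int must only be given the address of a real int object. Mixing the two (for example, unpack_int(&x)) yields a number derived from the address of x, not from its contents.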

Casting between a pointer type and a floating point type is explicitly disallowed as per section 6.5.4p4 regarding the cast operator:

A pointer type shall not be converted to any floating type. A floating type shall not be converted to any pointer type.

Of course the safest way to pass values via a void * is to store the value in a variable of the appropriate type, pass its address, then cast the void * in the function back to the proper pointer type. This is guaranteed to work.
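As a rough sketch of that approach, assuming an enumeration like the one in the question (read_value and value_type are hypothetical names, and widening every result to double is only for the sake of a single return type):

```c
#include <assert.h>
#include <stddef.h>

enum value_type { CHAR, INT, FLT, DBL };

/* Reads the object that `a` points to and returns it widened to double.
   The caller must pass the address of an object whose type matches `type`. */
static double read_value(const void *a, enum value_type type)
{
    assert(a != NULL);
    switch (type) {
    case CHAR: return *(const char *)a;
    case INT:  return *(const int *)a;
    case FLT:  return *(const float *)a;
    case DBL:  return *(const double *)a;
    }
    return 0.0; /* only reached if `type` is not a valid enumerator */
}
```

Every branch converts the void * back to the pointer type of the original object before dereferencing, so the same pattern covers integer and floating point types alike.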

dbush
  • `sizeof(float)` is the same size as `int`, and also `int *`, why can it not be converted using method 1? (i.e. compiler error a1, b1) – ryyker Jan 15 '20 at 15:59
  • @ryyker Because the standard doesn't explicitly allow for converting a `float` to a pointer, while it does allow it for an integer type. – dbush Jan 15 '20 at 16:03
  • That would explain the compiler error. Can you comment (or edit your post) to address why method 1 even works at all? In molbdnilo's comment above, he says that when using method 1, _it is highly unlikely that `(char) a` is 'P' or `(int) a` is `1024`_. But in my code, they are consistently converted to the expected values. – ryyker Jan 15 '20 at 16:09
  • @ryyker See my edit. The cast going in has to match the cast going out. – dbush Jan 15 '20 at 16:15
  • @ryyker Casting between a pointer and a floating point type is explicitly not allowed (see edit). – dbush Jan 21 '20 at 02:18

At your call sites, you are passing the address of each variable.

func4(&c, CHAR);
func4(&d, INT);
func4(&e, FLT);
func4(&f, DBL);
func4(pG, DBL);

(This is the right thing to do.) Therefore, inside func4, you must use what you are describing as "method 2":

T var1    = (T)a;    // WRONG, for any scalar type T
T var2    = *(T *)a; // CORRECT, for any scalar type T

You only got compile-time errors for floating-point types T because the C standard explicitly allows casts from pointer to integer types. But those casts produce a value that has some [implementation-defined] relation to the address of the variable supplied as an argument, not to its value. For instance,

#include <stdio.h>
int main(void)
{
    char c = 'P';
    printf("%d %d\n", c, (char)&c);
    return 0;
}

is a valid program that prints two numbers. The first number will be 80 unless you're running on an IBM mainframe. The second number is unpredictable. It could also be 80, but if it is, that's an accident, not something to rely on. It may not even be the same number each time you run the program.

I don't know what you mean by "[method 1] seems to work", but if you actually got the same value you passed in, it was purely by accident. Method 2 is what you should be doing.
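To make the difference concrete, here is a small sketch with two hypothetical helpers. It routes method 1 through uintptr_t and unsigned char so that the narrowing is well-defined modulo arithmetic, and it assumes the implementation provides uintptr_t and 8-bit bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Method 1 applied to an address: keeps only the low-order byte
   of the ADDRESS itself. The pointed-at object is never read. */
static unsigned char method1_byte(const void *a)
{
    return (unsigned char)(uintptr_t)a;
}

/* Method 2: reads the byte STORED at the address. */
static unsigned char method2_byte(const void *a)
{
    assert(a != NULL);
    return *(const unsigned char *)a;
}
```

For a char c = 'P', method2_byte(&c) is always 'P', whereas method1_byte(&c) depends only on where c happens to live in memory.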

zwol
  • _the C standard explicitly allows casts from pointer to integer types. But those casts produce a value that has some [implementation-defined] relation to the address of the variable supplied as an argument, not to its value._ is an explanation of why my `method 1` works. So then it is not _undefined behavior_ that is at work here, but getting the correct result _is_ implementation dependent. Thanks – ryyker Jan 15 '20 at 16:25
  • @rykker Indeed there is no undefined behavior here, but it's not implementation-dependent whether method 1 works. Method 1 _never_ works. If you still think it works, then I think you must be confused about the difference between the _address_ of a variable and the _value_ of a variable. Please explain why you think it works. – zwol Jan 15 '20 at 16:28
  • By demonstration. Each time I run the code for example `func4(&c, CHAR);`, `c` comes back `80`, `&c`, in my case, comes back `0x00DCFEBB`. I think `method 1` is validated by _the C standard explicitly allows casts from pointer to integer types._, is it not? – ryyker Jan 15 '20 at 16:32
  • Hmm. There are several possible explanations. To figure out what's really going on, I'm going to need to see the _complete and unmodified_ program that you ran that told you that "c comes back 80", and step-by-step instructions for how to run it and observe c coming back 80. It might be best to ask a new question specifically for that. – zwol Jan 15 '20 at 16:35
  • For clarity, yes, the C standard allows casts from pointer to integer types, but the _result_ of the cast is [an implementation-defined representation of] the _address_ of the pointee, not the value. – zwol Jan 15 '20 at 16:44
  • Here is a [link to file](https://www.dropbox.com/sh/20oljq224n6b9jz/AADaoj0pc33Jl-QHvtOlOZgGa?dl=0) that is complete and run-able. Note, it has not been cleaned up, so is slightly different than that posted here, but very simple and small. – ryyker Jan 15 '20 at 16:46
  • @ryyker OK. I modified that program slightly: I deleted all of the calls to `func4` but the first one, `func4(&c, CHAR);`, and I added `printf("%d %d\n", c, typ.b);` in their place. I compiled that and ran it 65,536 times in a row. Just like the program in my answer, for me it prints two numbers, the first of which is always 80, and the second of which varies from run to run and was never 80. If you make the same changes, do you see the same behavior? – zwol Jan 15 '20 at 17:26
  • @zwol: Unless an implementation defines `uintptr_t` and/or `intptr_t`, there need not be any integer type large enough to accommodate the integer representation of a pointer; conversion of a pointer directly to an integer type that is too small [which could be all integer types] would invoke UB. There also need not be any integer values other than constant zero that are capable of representing any valid pointer values; doing anything with an invalid pointer produced by an integer-to-pointer cast [which could include all results from such casts] would likewise invoke UB. – supercat Jan 15 '20 at 17:34
  • I will not be able to do that today, but will do it. When I do I will comment here again. I would not be surprised that the second number changes, as it is an address, and it can change. Nor am I surprised that the first number did not change. – ryyker Jan 15 '20 at 17:46
  • @rykker ... If neither of those things surprises you then I don't understand what you meant earlier by "c comes back 80". – zwol Jan 15 '20 at 19:49
  • @supercat Fair point, I forgot about the third sentence of 6.3.2.3p6. It's not terribly important by comparison, though. – zwol Jan 15 '20 at 19:51
  • (Referring to code in this post, not what I linked you in earlier comment.) For `char c = 'P'`, calling ` func(&c, CHAR)`, `c` is always converted properly in the called function. i.e. in the statement: `char cVar1 = (char)a;` `cVar1` is consistently converted to `80`. Sorry for the sloppy wording in the previous comment: _c comes back 80_. – ryyker Jan 15 '20 at 20:02
  • @ryyker How do you know that? What _exactly_ did you do to determine the value of `cVar1` inside `func4`? Be excruciatingly detailed. We're well into "this cannot be happening" territory here, absolutely anything could be relevant. – zwol Jan 15 '20 at 22:15

It would seem that recovering a value from a void * function argument should always require method 2, so why does method 1 (seem to) work with char and int types? Is this undefined behavior?

Because C specifically allows conversions between integers and pointers. This is allowed since there can be a need to express absolute addresses as integers, particularly in hardware-related programming. The result may be fine or it may invoke undefined behavior; see details below.

When you need to convert between pointers and integers, however, you should always use uintptr_t, for well-defined and portable conversions. This type wasn't part of C originally, which is why conversions to other integer types are still allowed.
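A minimal sketch of such a conversion, assuming the implementation provides uintptr_t (the type is optional in the standard); via_uintptr is a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Converts an int pointer to an integer and back again. */
static int *via_uintptr(int *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;  /* pointer -> integer */
    int *back = (int *)(void *)bits;        /* integer -> pointer */
    assert(back == p);                      /* round trip compares equal */
    return back;
}
```

The intermediate casts to void * are there because the round-trip guarantee for uintptr_t is stated in terms of pointers to void.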

If method 1 works for char and int, why does it not also work with at least the float type? It's not because their sizes are different, i.e.: sizeof(float) == sizeof(int) == sizeof(int *) == sizeof(float *). Is it because of a strict aliasing violation?

Because floating point types do not have a special-case allowed conversion like integer types do. Instead, there is an explicit rule forbidding casts between pointers and floating point types, since such conversions would not make any sense.

Strict aliasing only applies when you do an "lvalue access" of the stored value. Here you only do that in, for example, *(double *)a. You access the data through a type (double) compatible with the effective type of the object (also double), so this is fine.

(double *)a however, is never accessing the actual data, but just attempts to convert the pointer type to something else. So strict aliasing doesn't apply.

Generally, C allows a whole lot of wild pointer conversions, but you only get in trouble once you start to actually de-reference the data through an incorrect type. It is then that you can run into problems with incompatible types, misalignment and strict aliasing.
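As an aside that goes slightly beyond the question: when the raw bytes of an object are really needed as another type, copying them with memcpy avoids the incorrect-type dereference altogether. A sketch, assuming float and uint32_t have the same size on the target (float_bits is a hypothetical helper):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the object representation of a float as a 32-bit integer.
   Writing *(uint32_t *)&value instead would access a float object
   through an incompatible type and violate strict aliasing. */
static uint32_t float_bits(float value)
{
    uint32_t bits;
    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```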


Details:

  • char c = 'P'; ... char cVar1 = (char)a;.
    Conversion from pointer type to integer type. The result is undefined or implementation-defined 1). No lvalue access of the pointed-at data occurs, strict aliasing does not apply 2).
  • char c = 'P'; ... char cVar2 = *(char *)a;.
    Lvalue access of character through character pointer. Perfectly well-defined 3).
  • int d = 1024; ... int iVar1 = (int)a;.
    Conversion from pointer type to integer type. The result is undefined or implementation-defined 1). No lvalue access of the pointed-at data occurs, strict aliasing does not apply 2).

  • int d = 1024; ... int iVar2 = *(int *)a;
    Lvalue access of int through int pointer. Perfectly well-defined 3).

  • float e = 14.5; ... float fVar1 = (float)a;.
    Conversion from pointer type to float. Non-compatible type conversion, cast operator constraint violation 4).

  • float e = 14.5; ... float fVar2 = *(float *)a;.
    Lvalue access of float through float pointer. Perfectly well-defined 3).

  • double... same as float above.


1) C17 6.3.2.3/6:

Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

2) C17 6.5 §6 and §7. See What is the strict aliasing rule?

3) C17 6.3.2.1 Lvalues, arrays, and function designators, and
C17 6.3.2.3/1:

A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

Also, it is perfectly fine to lvalue access an object of a given type through a (qualified) pointer to that same type, C17 6.5/7: "a type compatible with the effective type of the object".

4) Not one of the valid pointer conversions listed in C17 6.3.2.3. Constraint violation of C17 6.5.4/4:

A pointer type shall not be converted to any floating type. A floating type shall not be converted to any pointer type.

Lundin
  • _...They rather have an explicit rule forbidding casts from pointers to floating point._. Nice, Thanks – ryyker Jan 15 '20 at 16:37
  • The C Standard does not mandate that implementations be capable of meaningfully processing any conversions between integers and pointers in any circumstances other than the conversion of a Null Pointer Constant. Because the authors of the Standard were unwilling to have implementations accept or reject syntactic constructs based upon their support for the semantics implied thereby, even implementations where no conversions could possibly be meaningful are required to accept code that includes them, provided such code is never executed. – supercat Jan 15 '20 at 17:31
  • @supercat Yes, `uintptr_t` is optional. Fortunately, 99.99% of all real-world computers use numeric addressing with adjacent addresses, so all that matters in the world where C programs are executed is address bus width versus integer size. I'll make a note that ISO JTC1/SC22/WG14 has a very deep fascination for parallel realities though. – Lundin Jan 16 '20 at 07:47
  • @Lundin: Unfortunately, compilers are allowed to implement `uintptr_t` without guaranteeing semantics any stronger than the fact that converting a pointer to a number and then converting that to a pointer will yield a pointer that *compares equal* to the original. The Standard does not require that given `T *p; uintptr_t uip;`, if `(uintptr_t)p` is observed, and `uip` happens to have the same value, then `(T*)uip` may be used to access the same storage. Neither clang nor gcc reliably upholds that principle. Indeed, in clang it's possible for the conversion of two pointers... – supercat Jan 16 '20 at 12:26
  • ...to `uintptr_t` and subsequent comparison between them to render one of the pointers incapable of accessing the objects it could otherwise access. – supercat Jan 16 '20 at 12:32
  • @supercat It's a quality of implementation issue regardless of what the standard says. I guess this is how it goes if there are only PC programmers in the committee + the open source compilers. Because if you can't convert to uintptr_t and back, the implementation is broken and cannot be used for hardware-related programming, embedded systems etc. There will always be a need to iterate over adjacent addresses when writing things like flash programmers, CRC, "walking 1" memory checks etc. This can't be done with pointer arithmetic, since that only works for arrays of the same object. – Lundin Jan 16 '20 at 15:10
  • @Lundin: Unfortunately, the authors of clang and gcc take the view that programmers shouldn't expect anything beyond what the Standard mandates, while the Standards Committee takes the view that the marketplace should be better positioned than the Committee to judge QoI issues. That would be fine if the language were being steered by compiler writers who were motivated by a desire to sell compilers rather than demonstrate cleverness. – supercat Jan 16 '20 at 15:47