downcasting pointer to int in c++ [ why the compiler does not allow (int)(void*) while allows (int)(long)(void*) ] Which is the best way to express it

Question

I had some old code that worked like a charm, but "suddenly" strange errors appeared. I can reproduce the problem with the code below:

int main(int argc, char **argv)
{
    void* p = 0;
    int b = (int)p;       // line1: this produces a compile error, at least with gcc -std=c++23
    int a = (int)(long)p; // line2: this line only emits a warning
    return 0;
}

The error is: "error: cast from 'void*' to 'int' loses precision [-fpermissive]"

I have to "downcast" the pointer p to an integer (in reality, in my case, the correct downcast would be from size_t to unsigned int or from ptrdiff_t to int).

I don't want to switch off warnings, so I found the "trick" of converting first to a long (more correctly, it should be a ptrdiff_t) and afterwards converting that to an int. In this way I bypass the "frontier controls".

BUT SEMANTICALLY I interpret both cases as being the same, so they should compile equivalently.

So I am asking: what should be done to implement the code in the most standard and clean way?

Isn't there a gap in logic? For the record, I compile using g++ and -m64.

The question is mostly theoretical by the way.
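
For illustration, the difference between line1 and line2 can be sketched as follows. C++ only allows a pointer to be explicitly converted to an integral type large enough to hold it. On LP64 targets such as 64-bit Linux, `int` is too small while `long` is not, and the second step in line2 is an ordinary integer-to-integer narrowing, which compilers accept. The sketch below makes both steps explicit with `std::uintptr_t`. The helper name `low32_of_address` is hypothetical, and the code assumes a platform that provides `std::uintptr_t` (the type is optional in the standard).

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the target provides std::uintptr_t and it can hold a pointer.
static_assert(sizeof(std::uintptr_t) >= sizeof(void*),
              "uintptr_t must be able to hold a pointer");

// Hypothetical helper (not from the question): returns the low 32 bits of
// the integer representation of a pointer.
inline std::uint32_t low32_of_address(const void* p) {
    // Step 1: pointer -> integer type wide enough to hold it.
    const auto full = reinterpret_cast<std::uintptr_t>(p);

    // Converting back from uintptr_t yields a pointer equal to the original.
    assert(reinterpret_cast<const void*>(full) == p);

    // Step 2: integer -> narrower integer; unsigned narrowing is modulo 2^32,
    // so the upper bits of the address are discarded here.
    return static_cast<std::uint32_t>(full);
}
```

The result is only a truncated address: two different pointers can map to the same 32-bit value, so it is not usable as a table index by itself.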

An `int` may not be big enough to hold pointer values. Why do you want to do this? Compare `sizeof(void*)` with `sizeof(int)` and think about it. How's the pointer value going to fit? — Ted Lyngmo, Aug 16 '23 at 20:31
On your system `int` is probably 32-bit and pointers are 64-bit. — Richard Critten, Aug 16 '23 at 20:31
You're compiling for a 64 bit platform, so pointers don't fit in ints. Both line1 and line2 are equally wrong, even if you hide the error with the extra cast in line2. — interjay, Aug 16 '23 at 20:31
Use [`std::intptr_t`](https://en.cppreference.com/w/cpp/types/integer) if you need a signed integer that's large enough to store a pointer. That's what it's there for — Brian61354270, Aug 16 '23 at 20:31
This code has implementation-defined behavior. It did when it "worked like a charm", and still does. — Drew Dormann, Aug 16 '23 at 20:33
Remember that `long` can be confusing in this context because on most platforms it switches between 32 and 64 bit with the address space but on Windows it is always 32 bit — Homer512, Aug 16 '23 at 20:34
The int that is downcast from the pointer in the example is an index into a table that holds fewer than 2^32 elements. So the facts about void* being 64 bits and int being 32 are known things. So the question is not why I need to do this, but why a double conversion bypasses the problem while a single one does not, and whether that is acceptable or not. As indicated in my text, the full path should be to take the difference between two non-void addresses to obtain a ptrdiff_t and then convert that to an int. — George Kourtis, Aug 16 '23 at 20:38
@GeorgeKourtis c-style casts tell the compiler that you know best and even if the code has UB you want this to happen. So all the cast does is suppress a warning that something bad is going to happen. Note that these c-style casts are equivalent to `reinterpret_cast`. — Richard Critten, Aug 16 '23 at 20:39
@RichardCritten This is what I was supposing until now, but it seems that somewhere along the way things have changed, and even if I cast (int)(void*) the compiler does not allow me to do it !!! — George Kourtis, Aug 16 '23 at 20:42
Many people think of pointer values as a kind of number. They are often *represented* that way, to be sure, and pointers can be *converted* to numbers. This kind of thinking is necessary to get to the idea that the two flagged lines are semantically the same. But semantically, pointers **are not** numbers, and much confusion and grief can be averted by training yourself not to think of them that way. — John Bollinger, Aug 16 '23 at 20:44
@JohnBollinger It depends on the level you work at, when coping with cpu registers, you have only numbers, and semantics are a different thing that grows later on depending on level of abstraction you work at. So yes C pointers are not numbers especially when typed as pointers to specific structures. — George Kourtis, Aug 16 '23 at 20:48
@GeorgeKourtis the point is that it was never well defined code and now, luckily, the compiler has pointed it out. — Richard Critten, Aug 16 '23 at 20:56
@RichardCritten Probably the answer is that I should first create a ptrdiff_t by taking the difference of two pointers, thus getting the index from it, which is an integer of some size, and afterwards narrow that integer to the int, which in our case happens to be a 32-bit value. And this is allowed by the compiler. — George Kourtis, Aug 16 '23 at 20:59
For what purpose are you casting a pointer to an `int` (or `std::size_t` or whatever)? A `ptrdiff_t` has constraints on it (in particular, has to be elements in the same array), which may-or-may-not be abided by in your actual code. — Eljay, Aug 16 '23 at 21:03
@Eljay If my array has fewer than 2^32 elements, then I don't need indexes 64 bits in size, but just 32. Such an array with an element size of e.g. 32 bytes may easily consume 32*4GB=128GB of memory (it happens I have just 80GB on my PC). — George Kourtis, Aug 16 '23 at 21:04
Having 4 giga-elements of 32-bytes each isn't going to be any less memory consuming merely because you are using 32-bit `int` indexes. — Eljay, Aug 16 '23 at 21:11
@Eljay If the indexes are stored inside the 32-byte elements themselves, referring to other elements in the same group, you can easily see that instead of 32 bytes per element I would need 64 bytes, and in the extreme case instead of 128GB I would need 256GB. Because of that, garbage collection would take twice the time, just because I would be accessing twice the memory, while every odd 32-bit unit in my memory would always contain 0. — George Kourtis, Aug 16 '23 at 21:19
@Eljay Just for the record, when I have more physical memory in my machine (and more needs), I will expand the index from 4 bytes to 5 bytes, spanning an address space of 2^40 different elements. Supposing they have a size of 32 bytes each, I may fill up to 2^45 bytes of memory, thus 32TB. This is near the physical limit that current processors have (48 bits -> 256 TB). — George Kourtis, Aug 16 '23 at 21:27
You seem to be assuming your memory will start at 0. You may not be able to guarantee this. — user4581301, Aug 16 '23 at 21:43
I am speaking of "indexes" not of addresses, so memory could start wherever it likes, by the way regarding my question I just found the "correct code" and it is: unsigned int a=(uintptr_t)p; — George Kourtis, Aug 16 '23 at 21:52
I find it hard to see the value of saving 4 bytes for having an index in an `int` rather than a `std::size_t` for an array that is 32 TB. This is the XY-iest scenario I've seen in a while. — Eljay, Aug 16 '23 at 22:20
Yes, both your cases are semantically equivalent. And both are equally bad practice. The fact one is allowed by your compiler and the other is not is .... unfortunate .... but not unexpected, since your code is in the realm of unspecified or (depending on what else you are doing) implementation-defined behaviour. If you *must* convert a pointer to an integral type, probably better to use `std::intptr_t` or similar rather than basic types (`int`, `long`, etc). And take steps to ensure you reverify that code thoroughly whenever it is built by a different compiler (or compiler version). — Peter, Aug 17 '23 at 02:06
Keep in mind that on a 64 bit platform, the start location of your array can be anywhere in that 64 bit address space (actually for x86-64, anywhere in 48 bit). Even if the total size of the array is smaller. If you just want to store the offset from the start of the array as `int`, that is fine. But that's not what you are doing. You just invite random undefined behavior depending entirely on where the kernel chose to give you virtual memory — Homer512, Aug 17 '23 at 12:43
@Homer512 You are correct. It was, anyway, code from 10 years ago, and I didn't have ideas as clear as the ones I have now. The correct way to go is to speak about indexes and not addresses. — George Kourtis, Aug 17 '23 at 13:39
BTW: A lot of the weirdness of pointers can be understood by C/C++ trying to support platforms without flat memory models. Imagine segmented memory like i286. Then a pointer might be an opaque structure of one int-sized segment selector and one int-sized offset within the segment. `intptr_t` would encode these two as one integer but the bits would have platform-specific meaning. `size_t` would be smaller than `intptr_t` because it only needs to encode the offset. And doing pointer arithmetic with pointers from different allocations is meaningless since you cannot subtract segment selectors — Homer512, Aug 17 '23 at 13:54
@Homer512 • or banked switched memory of a 65816. Good times! — Eljay, Aug 17 '23 at 14:12
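
The index-based approach the thread converges on can be sketched as follows: subtract two pointers into the same array to get a `std::ptrdiff_t`, then narrow it to 32 bits with an explicit range check. This is a minimal sketch under stated assumptions, not the asker's actual code: `Node` and `index_of` are hypothetical names, and `Node` merely stands in for the 32-byte elements mentioned in the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical element type: a 32-bit index to another element plus payload.
struct Node {
    std::uint32_t next;         // index of another Node in the same table
    unsigned char payload[28];  // placeholder for the rest of the element
};

// Hypothetical helper: index of `elem` within the table starting at `base`.
// Both pointers must point into the same array (or one past its end);
// otherwise the subtraction itself has undefined behavior.
inline std::uint32_t index_of(const Node* base, const Node* elem) {
    assert(base != nullptr && elem != nullptr);

    // Pointer difference is measured in elements, not bytes.
    const std::ptrdiff_t diff = elem - base;

    // Reject values that a 32-bit unsigned index cannot represent.
    if (diff < 0 ||
        static_cast<std::uintmax_t>(diff) >
            std::numeric_limits<std::uint32_t>::max()) {
        throw std::out_of_range("index does not fit in 32 bits");
    }
    return static_cast<std::uint32_t>(diff);
}
```

Because the value is an offset from `base` rather than an address, it does not depend on where the allocation lands in the 64-bit address space, which is the concern raised about truncating raw pointers.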
