1

Let us consider the following piece of code:

#include <stdio.h>
int main()
{
    int v1, v2, *p;
    p = &v1;
    v2 = &v1;
    printf("%d\t%d\n",p,v2);
    printf("%d\t%d\n",sizeof(v2),sizeof(p));
    return 0;
}

We can see, as expected, that the v2 variable (int) occupies 4 bytes and that the p variable (int pointer) occupies 8 bytes.

So, if a pointer occupies more than 4 bytes of memory, why can we store its content in an int variable?

In the underlying implementation, do pointer variables store only the memory address of another variable, or do they store something else?

tadman
Zaratruta
  • "*why can we store its content in an int variable*". You can't. You are losing the higher-order 4 bytes (assuming 64-bit addresses on your system, as implied by your description). Those bytes may not always be 0, so just getting them chopped off is a problem. – kaylum Aug 15 '22 at 22:11
  • `%d` is not the correct format for a pointer. Use `%p` instead. Turning up the compiler warnings can help you avoid incorrect behaviour. https://godbolt.org/z/T3rev9jYo – Retired Ninja Aug 15 '22 at 22:11
  • "So, if a pointer occupies more than 4 bytes of memory, why can we store its content in an int variable?": we cannot. p is not an integer; it's a pointer to an integer. – Jean-François Fabre Aug 15 '22 at 22:12
  • `sizeof(int) == sizeof(int*)`? My guess: No. On a modern platform `sizeof(int) == 4` and `sizeof(int*) == 8`. – tadman Aug 15 '22 at 22:12
  • I think you're confusing "bits" and "bytes" here. – tadman Aug 15 '22 at 22:13
  • @tadman Sorry. I meant to say bytes. I wrote it wrongly. – Zaratruta Aug 15 '22 at 22:17
  • @tadman Yes. I said that in the post. sizeof(int) == 4 and sizeof(int*) == 8 – Zaratruta Aug 15 '22 at 22:18
  • You did say bits, so I wasn't sure if you were using part of the pointer itself for some mysterious purpose in the end. If `int` is smaller than `int*` you can't possibly fit a pointer in an `int`. It will be truncated. You could store a pointer in [`uint64_t`](https://en.cppreference.com/w/c/types/integer), though, that's just a matter of casting. I'm not sure why you'd want to, as that's pretty useless. You can always use the contents of a pointer more directly. – tadman Aug 15 '22 at 22:18
  • @tadman I think that's really useless. It's just a matter of curiosity. – Zaratruta Aug 15 '22 at 22:22
  • It's not useless from an understanding perspective, but in practical programming it really is a waste of time. Get more comfortable using pointers directly. – tadman Aug 15 '22 at 22:23
  • @tadman I do that. As I said, it's just a matter of curiosity. A friend of mine asked this and I was not able to provide a good answer. – Zaratruta Aug 15 '22 at 22:24
  • It's not much different from storing an `int` variable in a `char`. It's allowed, but if the original value is too large you'll lose part of it. – Barmar Aug 15 '22 at 22:26
  • Duplicate: ["Pointer from integer/integer from pointer without a cast" issues](https://stackoverflow.com/questions/52186834/pointer-from-integer-integer-from-pointer-without-a-cast-issues) – Lundin Aug 16 '22 at 06:43
  • It should be noted that C was never intended to be a safeguarding high-level language. However, you'll get warnings here (normally). You should rather enjoy being able to view a pointer as an int, where in another language the code would not compile. You also maybe should not write software for a nuclear plant in C. – Joop Eggen Aug 16 '22 at 11:18

4 Answers

2

There is a warning either way, with or without a cast; see below.

main.c: In function ‘main’:
main.c:6:8: warning: assignment to ‘int’ from ‘int *’ makes integer from pointer without a cast [-Wint-conversion]
     v2 = &v1;

main.c: In function ‘main’:
main.c:6:10: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
     v2 = (int) &v1;

In the first case, assigning a pointer value directly to an int is not appropriate, because the two types are not compatible.

In the second case, where the pointer is cast to an int, the compiler recognizes the difference in size, which means v2 cannot hold the complete value of &v1.

Conclusion: both cases are "bad" in that they produce undesired behaviour.

About your question "So, if a pointer occupies more than 4 bytes of memory, why can we store its content in an int variable?" - It can NOT be stored completely in an int variable.

About your question "In the underlying implementation, do pointer variables store only the memory address of another variable, or do they store something else?" - A pointer just holds an address. (It could be the address of another variable or not. It does not matter. It just points to an address.)

sidcoder
2

We can see, as expected, that the v2 variable (int) occupies 4 bytes and that the p variable (int pointer) occupies 8 bytes.

I'm not sure what exactly the source of your expectation is there. The C language does not specify the sizes of ints or pointers. Its requirements on the range of representable values of type int allow an int as small as two 8-bit bytes, and historically that was a relatively common size for int. Some implementations these days have larger ints (and maybe also larger char, which is the unit of measure for sizeof!).

I suppose that your point here is that in the implementation tested, the size of int is smaller than the size of int *. Fair enough.

So, if a pointer occupies more than 4 bytes of memory, why can we store its content in an int variable?

Who says the code stores the pointer's (entire) content in the int? It converts the pointer to an int,* but that does not imply that the result contains enough information to recover the original pointer value.

Exactly the same applies to converting a double to an int or an int to an unsigned char (for example). Those assignments are allowed without explicit type conversion, but they are not necessarily value-preserving.

Perhaps your confusion is reflected in the word "content". Assignment does not store the representation of the right-hand side to the left-hand object. It converts the value, if necessary, to the target object's type, and stores the result.

In the underlying implementation, do pointer variables store only the memory address of another variable, or do they store something else?

Implementations can and have varied, and so too the meaning of "address" for different machines. But most commonly these days, pointers are represented as binary numbers designating locations in a flat address space.

But that's not really relevant. C specifies that pointers can be converted to integers and vice versa. It also provides integer types intptr_t and uintptr_t (in stdint.h) that support full-fidelity round trip void * to integer to void * conversion. Pointer representation is irrelevant to all that. It is the implementation's responsibility to implement the types and conversions involved so that they behave as required, and there is more than one way to do that.


*C actually requires an explicit conversion (that is, a cast) between pointers and integers. The language specification does not define the meaning of the cast-less assignment in the example code, but some compilers do accept that and perform the needed conversion implicitly. My remarks assume such an implementation.

John Bollinger
1

The key to understanding what's going on here is that C is an abstraction layer on top of the underlying ISA. Most architectures have little more than registers and memory addresses[1] to work with, all of which are of a fixed size. When manipulating "variables", you're really just expressing your intent, which the compiler translates into more concrete instructions.

On x86_64, a common architecture, an int is in actuality either a portion of a 64-bit register, or it's a 4-byte location in memory that's aligned on a 4-byte boundary. An int* is a 64-bit value, or 8-byte location in memory with corresponding alignment constraints.

Putting an int* value into a suitably sized variable, such as uint64_t, is allowed. Putting that value back into a pointer and exercising that pointer may not be permitted; it depends on your architecture.

From the programmer's perspective a pointer is just 64 bits of data. From the CPU's perspective it may contain more than that, with modern architectures having things like internal "Pointer Authentication Codes" (PACs) that ensure pointers cannot be injected from external sources. It gets quite a bit more complicated under the hood.

In general it's best to treat pointers as opaque, that is their actual value is as good as random and irrelevant to the internal operation of your program. It's only when you're doing deeper analysis at the architectural level with sufficiently robust profiling tools that the actual internals of the pointer can be informative or relevant.

There are several well-defined operations you can do on pointers, like p[n] to access specific offsets within the bounds of a structure or allocation, but outside of that you're pretty limited in what you can do, or even infer. Remember that modern CPUs and operating systems use virtual memory, so pointer addresses are "fake" and don't represent where the data is in physical memory. In fact, the layout is often deliberately randomized (address space layout randomization) to make addresses harder to guess.


[1] This disregards VLIW, SIMD, and other extensions, which are not so simple.

tadman
-1

So, if a pointer occupies more than 4 bytes of memory, why can we store its content in an int variable?

You cannot; indeed, the code you posted is not legal.


#include <stdio.h>
int main()
{
    int v1, v2, *p;

This declares two int variables and a pointer to int called p.

    p = &v1;

This is legal, as it assigns to p the address of the integer variable v1.

    v2 = &v1;  /* INCORRECT!!! */

This is not. It assigns to an int variable the address of another variable (which is a pointer value and, as you rightly say, does not fit). The most probable intention of the code writer was:

    v2 = *p;

which assigns to v2 the integer value stored at the address pointed to by p (which is pointing to v1, so it assigns v2 the value stored in v1).

Luis Colorado