How do pointers reference multi-byte variables?

Question

I am confused as to how C pointers actually reference the memory address of a variable. I am probably missing something here, but if, for example, an int is 32 bits (as is common in C), then it would be stored in 4 bytes.

If I am not mistaken then each memory address tends to be a byte in size, as these are generally the smallest units of addressable memory. So if an int takes up 4 bytes, then wouldn't it have 4 memory addresses? (as it is stored over 4 8-bit memory addresses).

If this is the case, then how come a pointer only holds one memory address? (or rather only displays one when printed, if it holds more?). Is this simply the first address that stores the int? (assuming they are stored contiguously).

I have tried to find answers online but this has only led to further confusion.

Essentially, what you're missing is that an `int` is 4 *consecutive* bytes, not just *any* four bytes. (What you mention here is, by the way, the key to the difference between many languages' list and array types, e.g. Python's `list` and `np.ndarray`.) — Linuxios, Jul 22 '19 at 21:40
Yes, an int takes up 4 memory addresses, but they are consecutive. That is, if the address of an int is, say, 5000, then the bytes that make up that int are at addresses 5000, 5001, 5002, and 5003. When you do something like `x = *ip`, it copies four consecutive bytes from the pointer address to the variable's address. — Lee Daniel Crocker, Jul 22 '19 at 21:40

ShadowRanger · Accepted Answer · 2019-07-22T21:39:11.847

Yes, technically, there would be four addressable bytes for the int you describe. But the pointer points to the first byte, and reading an int from it reads that byte and the subsequent three bytes to construct the int value.
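As a rough illustration of the paragraph above, here is a minimal sketch in C. The helper names `byte_of_int` and `read_int_from` are made up for this example. The first returns the address of one byte inside an int; the second rebuilds the int from the `sizeof(int)` bytes that start at a given address, which is conceptually what dereferencing an `int *` does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: address of the i-th byte of the int that ip
   points to (caller keeps i below sizeof(int)). */
const unsigned char *byte_of_int(const int *ip, size_t i)
{
    return (const unsigned char *)ip + i;
}

/* Hypothetical helper: rebuild an int from the sizeof(int) bytes that
   start at addr, i.e. the first byte and the bytes right after it. */
int read_int_from(const void *addr)
{
    int value;
    memcpy(&value, addr, sizeof value);
    return value;
}
```

Calling `byte_of_int(&x, 0)` gives the same address as `&x`, and `read_int_from(&x)` gives back the value of `x`.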

If you tried to read from a pointer referring to one of the other three bytes, at the very least you'd get a different value (because it would read the remains of the one int, and additional bytes next to it), and on some architectures which require aligned reads (so four byte values must begin at an address divisible by four), your program could crash.
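The "different value" effect can be shown without actually dereferencing a misaligned `int *`. The sketch below uses a hypothetical helper, `read_int_at_byte_offset`, that copies `sizeof(int)` bytes starting at an arbitrary byte offset into a buffer. It relies on `memcpy`, which has no alignment requirement, so it stays well defined as long as the read stays inside the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read sizeof(int) bytes starting byte_offset bytes
   into buf and interpret them as an int. The caller must make sure that
   byte_offset + sizeof(int) does not run past the end of buf. */
int read_int_at_byte_offset(const void *buf, size_t byte_offset)
{
    int value;
    memcpy(&value, (const unsigned char *)buf + byte_offset, sizeof value);
    return value;
}
```

With `int a[2] = {1, 2}`, offset 0 yields 1 and offset `sizeof(int)` yields 2, while offset 1 yields a number built from the tail of `a[0]` plus the first byte of `a[1]`. Which number that is depends on the machine's byte order.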

The language tries to protect you from creating a misaligned pointer like that: if you have an `int*` and add 1 to it, the raw address increases not by one but by `sizeof(int)` (4 in your case). That way a pointer into an array of `int` can traverse the array value by value without accidentally reading something that is logically part of one `int` and part of its neighbor.
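A small sketch of that pointer arithmetic, with function names invented for illustration: `int_pointer_stride` measures how many bytes apart `p` and `p + 1` are, and `sum_ints` walks an array by incrementing an `int` pointer.

```c
#include <stddef.h>

/* Distance in bytes between p and p + 1 for an int pointer. */
ptrdiff_t int_pointer_stride(void)
{
    int pair[2] = {0, 0};
    int *p = pair;
    /* Casting to char * lets us count the raw byte distance. */
    return (char *)(p + 1) - (char *)p;
}

/* Sum n ints by stepping a pointer; each ++p moves one whole int. */
long sum_ints(const int *first, size_t n)
{
    long total = 0;
    for (const int *p = first; p != first + n; ++p)
        total += *p;
    return total;
}
```

Here `int_pointer_stride()` equals `sizeof(int)`, whatever that happens to be on the platform.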

I see. So does this mean that a variable is always stored contiguously? (at least in C). — Jr795, Jul 22 '19 at 21:38
@Jr795: I've never seen an architecture that doesn't store it contiguously, and I'm pretty sure parts of the C standard library imply contiguous storage by the nature of their API (e.g. `memcpy` couldn't work on arrays of `int`s unless `int`s were guaranteed to consist of `sizeof(int)` contiguous bytes). The *order* of the bytes isn't guaranteed (read up on [endianness](https://en.wikipedia.org/wiki/Endianness) if you're interested), but it's always stored contiguously. — ShadowRanger, Jul 22 '19 at 21:41

score 2 · Answer 2 · answered Jul 22 '19 at 21:36

A pointer points to the starting address of your type. If you search for "pointer size", you will find that it generally depends on your CPU architecture, not on the primitive type or object being pointed to.

See the question "What is the size of a pointer?", which will hopefully support this, although that question is about C++.
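One way to check this on a given machine is to compare `sizeof` for a few pointer types. This is only a sketch with hypothetical function names, and it assumes a mainstream platform: the C standard does not promise that every pointer type has the same size, although `char *` and `void *` are required to share a representation.

```c
#include <stddef.h>

/* Hypothetical check: 1 if pointers to several differently sized object
   types all have the same size on this platform. Assumption: this holds
   on common desktop and server targets; the standard does not require it. */
int object_pointer_sizes_match(void)
{
    return sizeof(char *) == sizeof(int *)
        && sizeof(int *) == sizeof(double *)
        && sizeof(double *) == sizeof(long double *);
}

/* Size of an int pointer in bytes; commonly 4 on 32-bit targets and
   8 on 64-bit targets. */
size_t int_pointer_size(void)
{
    return sizeof(int *);
}
```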

score 1 · Answer 3 · answered Jul 22 '19 at 21:37


One byte is the smallest addressable unit, but that doesn't mean an address is only one byte. Otherwise you'd only have 256 bytes you could address! Pointers are typically either 4 or 8 bytes in size.

The address of a variable refers to the address of its first byte. The remaining bytes are understood to immediately follow it, and the number of bytes is determined by the datatype.


dbush


Regarding your first paragraph: I don't *think* the OP was saying they thought a pointer was only one byte in size, only that they thought a pointer referenced a specific byte (which is true, but reading it reads enough bytes to satisfy the type in question). – ShadowRanger Jul 22 '19 at 21:44

score 0 · Answer 4 · answered Jul 22 '19 at 21:54

The specifics depend on the actual architecture of the machine (what kind of CPU, what kind of memory, etc.), so I am assuming you care about a modern 32-bit processor where both an int and a pointer take four bytes of memory. Keep in mind that the same ideas apply when integers are two bytes and when pointers are 8 bytes, but we have to focus on just one set of examples.

Having said all that, you are completely correct that an int uses four contiguous bytes of memory which means it has four separate memory addresses and that a pointer holds only one address - it is the address of the first byte of the int.

So the CPU has an instruction for reading an int. The instruction takes the address of the first byte of the int and reads the entire int, all four bytes of it. That's why you only need one address to read an entire int. So int i = 42 stores a four-byte integer in i, and the value is interpreted to mean the number 42.

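But at the machine level a pointer is also stored as an integer whose value is a memory address, so it can be read in exactly the same way. So int *p = (int *)42 stores a four-byte integer in p, and the value is interpreted to mean memory address 42. (The cast is needed because C does not convert an int to a pointer implicitly.)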
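To make the "address as a number" idea concrete, here is a minimal sketch that converts a pointer to an integer and back. The function name is hypothetical, and it assumes the platform provides `uintptr_t` from `<stdint.h>`, which is optional in the standard but available on common compilers.

```c
#include <stdint.h>

/* Hypothetical helper: convert a pointer to an integer and back again.
   Assumption: uintptr_t exists on this platform. The standard guarantees
   that a void * converted to uintptr_t and back compares equal to the
   original pointer. */
int *round_trip_through_integer(int *p)
{
    uintptr_t as_number = (uintptr_t)(void *)p;  /* the address as an integer */
    return (int *)(void *)as_number;             /* the same address as a pointer */
}
```

The pointer that comes back compares equal to the one that went in, so dereferencing it reads the same int.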

All this gets complicated when you start talking about the order the bytes are stored in, so we won't talk about that (however, if you want to find out, the term is endianness; see https://en.wikipedia.org/wiki/Endianness).
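For the curious, byte order can be observed by looking at which byte of a known value sits at the lowest address. The sketch below uses invented function names and a fixed-width `uint32_t` so that the value is exactly four bytes; it only distinguishes the two common orderings.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the byte stored at the lowest address of value. */
unsigned char lowest_address_byte(uint32_t value)
{
    unsigned char bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);  /* copy the object's raw bytes */
    return bytes[0];
}

/* 1 if the least significant byte comes first in memory, 0 otherwise. */
int is_little_endian(void)
{
    return lowest_address_byte(UINT32_C(0x01020304)) == 0x04;
}
```

On a little-endian machine the first byte of `0x01020304` is `0x04`; on a big-endian machine it is `0x01`. Either way the four bytes are contiguous and the pointer still refers to the one at the lowest address.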
