6

I'm learning C and am currently learning about pointers. I understand the principle of storing the address of a byte in memory as a variable, which makes it possible to read the byte from memory and to write to that memory address.

However, I don't understand where the address of a pointer is stored. Let's say the value of a pointer (the address of a byte in memory) is stored somewhere in memory - how can the program know where the pointer is stored? Wouldn't that need a pointer for a pointer resulting in endless pointers for pointers for pointers... ?


UPDATE

The actual question is: "How does the compiler assign memory addresses to variables?" I found this question, which covers that topic.

Thanks to everybody who's answered.

MinecraftShamrock
  • The compiler knows where the pointer is the same way it knows where any other variable is. How it knows where any variables are is a longer story. – user2357112 Aug 08 '14 at 08:15
  • a good question is not the same as a question that you have to be an expert to answer. can downvoters please leave a comment on how the question can be improved. – sp2danny Aug 08 '14 at 08:24
  • "how does the program know" is not the same as "how does the compiler know". A pointer typed variable in C is just like any other type of variable. It has an address. The compiler does **not** (need to) create a separate pointer **inside your code** for it. How it knows its address -- in the first place, it may not even know it at compile-time (a lot of stuff happens at link and runtime that affects its actual address), but even if it knows -- it will be represented by a data structure **inside the compiler,** and definitely not inside your own program. – The Paramagnetic Croissant Aug 08 '14 at 08:26
  • Alright. So the actual question is "How does the compiler know where the variables are stored at runtime?" – MinecraftShamrock Aug 08 '14 at 09:00
  • Oh, and @user2357112 I'd love to hear this longer story :) – MinecraftShamrock Aug 08 '14 at 09:03
  • Or better: "How does the compiler know which memory address is assigned to which variable name at runtime?" - Also because this should be dynamically assigned at runtime, shouldn't it? – MinecraftShamrock Aug 08 '14 at 09:08
  • It really is turtles all the way down. – Hot Licks Aug 08 '14 at 16:29

5 Answers

6

This is an implementation detail, but...

Not all addresses are stored in memory. The processor also has registers, which can be used to store addresses. There are only a handful of registers which can be used this way, maybe 16 or 32, compared to the billions of bytes you can store in memory.

Variables in registers

Some variables will get stored in registers. If you need to quickly add up some numbers, for example, the compiler might use, e.g., %eax (which is a register on x86) to accumulate the result. If optimizations are enabled, it is quite common for variables to exist only in registers. Of course, only a few variables can be in registers at any given time, so most variables will need to get written to memory at some point.

If a variable is saved to memory because there aren't enough registers, it is called "spilling". Compilers work very hard to avoid register spilling.

int func()
{
    int x = 3;
    return x;
    // x will probably just be stored in %eax, instead of memory
}
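
As a rough sketch of the accumulation case mentioned above (the function name `sum_to` is made up for illustration), the running total below is a natural candidate for a register. Whether it actually ends up in one depends on the compiler, the target, and the optimization level.

```c
/* Hypothetical example: adds the integers 1..n. */
int sum_to(int n)
{
    int total = 0;              /* its address is never taken */
    for (int i = 1; i <= n; i++)
        total += i;             /* an optimizing compiler may keep total and i in registers */
    return total;
}
```

Because `&total` never appears, nothing forces the compiler to give `total` a memory address at all.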

Variables on the stack

Commonly, one register points to a special region called the "stack". So a pointer used by a function may be stored on the stack, and the address of that pointer can be calculated by doing pointer arithmetic on the stack pointer. The stack pointer doesn't have an address because it's a register, and registers don't have addresses.

void func()
{
    int x = 3; // address could be "stack pointer + 8" or something like that
}

The compiler chooses the layout of the stack, giving each function a "stack frame" large enough to hold all of that function's variables. If optimization is disabled, variables will usually each get their own slot in the stack frame. With optimization enabled, slots will be reused, shared, or optimized out altogether.
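
One way to see stack slots from inside a program is to take the addresses of a few locals. This is only a sketch: the function name `locals_are_distinct` is invented here, and the actual addresses and their spacing are up to the compiler. Taking an address with `&` also forces the variable into memory, so these locals cannot live only in registers.

```c
#include <stdio.h>

/* Hypothetical example: returns 1 if two live locals occupy different addresses. */
int locals_are_distinct(void)
{
    int a = 1;
    int b = 2;
    /* Typically both slots sit at small offsets from the stack pointer. */
    printf("a is at %p, b is at %p\n", (void *)&a, (void *)&b);
    return &a != &b;    /* distinct objects always have distinct addresses */
}
```

The printed values usually change between builds, and often between runs, which is why the code only relies on the two addresses being different.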

Variables at fixed addresses

Another alternative is to store data at a fixed location, e.g., "address 100".

// global variable... could be stored at a fixed location, such as address 100
int x = 3;

int get_x()
{
    return x; // returns the contents of address 100
}

This is actually not uncommon. Remember that "address 100" doesn't necessarily correspond to RAM; it is a virtual address referring to part of your program's virtual address space. Virtual memory allows multiple programs to all use "address 100", and that address will correspond to a different chunk of physical memory in each running program.

Absolute addresses can also be used on systems without virtual memory, or for programs which don't use virtual memory: bootloaders, operating system kernels, and software for embedded systems may use fixed addresses without virtual memory.

The compiler specifies an absolute address by leaving a "hole" in the machine code, called a relocation.

int get_x()
{
    return x; // returns the contents of address ???
              // Relocation: please put the address of "x" here
}

The linker then chooses the address for x, and places the address in the machine code for get_x().
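
From inside the program, the result of that step is visible as an address that stays the same for the whole run. A minimal sketch, where `address_of_x` and `print_address_of_x` are names made up for this example and the printed value depends on the platform and on how the program was linked:

```c
#include <stdio.h>

int x = 3;  /* global: its location is settled before any function runs */

/* Hypothetical helper: returns the address that was assigned to x. */
int *address_of_x(void)
{
    return &x;
}

/* Prints the address; the value shown is platform- and build-dependent. */
void print_address_of_x(void)
{
    printf("x lives at %p\n", (void *)address_of_x());
}
```

Calling `address_of_x()` twice in one run gives the same pointer both times, since the function never computes or allocates anything at run time.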

Variables relative to the program counter

Yet another alternative is to store data at a location relative to the code that's being executed.

// global variable... could be stored at address 100
int x = 3;

int get_x()
{
    // this instruction might appear at address 75
    return x; // returns the contents of this address + 25
}

Shared libraries almost always use this technique, which allows the shared library to be loaded at whatever address is available in a program's address space. Unlike programs, shared libraries can't pick their address, because another shared library might pick the same address. Programs can also use this technique, and such a program is called a "position-independent executable". Programs are made position-independent on systems which lack virtual memory, or to provide additional security on systems with virtual memory, since the load address can then vary from run to run, which makes it harder to write shellcode.

Just like with absolute addresses, the compiler will put a "hole" in the machine code and ask the linker to fill it in.

int get_x()
{
    return x; // return the contents of here + ???
              // Relocation: put the relative address of x here
}
Dietrich Epp
  • Storing data at a fixed location would not make sense since exactly this fixed address might not be available at runtime, would it? – MinecraftShamrock Aug 08 '14 at 09:06
  • I think one concept you're missing is that the compiler doesn't "think" in terms of exact memory addresses, it "thinks" in terms of offsets. The OS does the real layout, saying "you can have this chunk of RAM", and the program executes with all of its memory addresses adding to whatever initial location is given to it. See e.g. http://stackoverflow.com/questions/19101449/how-does-compiler-lay-out-code-in-memory?rq=1 and all of its related questions. – Scott Mermelstein Aug 08 '14 at 09:20
  • What determines the size of the block of RAM that is needed? Can this block be expanded as needed? And does this "block of memory" refer to the heap or stack? – MinecraftShamrock Aug 08 '14 at 09:25
  • @MinecraftShamrock: Virtual memory means that *all* addresses are available at runtime, under the right circumstances. Fixed addresses are actually not uncommon! – Dietrich Epp Aug 08 '14 at 15:43
  • @ScottMermelstein: Not true! The compiler (well, the linker, actually) is free to use absolute addresses for global variables. If PIC is enabled, the address will be PC-relative, but often PIC is *not* enabled, which means that the linker will choose absolute addresses. In other words, the program will say, "Give me a chunk of RAM at addresses 0x4800 through 0x7A00" and the OS will happily comply... this is possible because of virtual memory: multiple programs can use the same range of addresses. – Dietrich Epp Aug 08 '14 at 15:47
  • @ScottMermelstein: The process by which the compiler specifies an absolute address is somewhat roundabout: the global variable will have an absolute address, and the compiler will specify the address as part of the "relocation table" of a *relocatable object*, which is the `*.o` or `*.obj` file produced by the compiler. Think of it like "this is an absolute address, but I don't know exactly what that address is yet". The linker will then choose the exact address, and modify the machine code so it refers to the address. The OS (kernel) doesn't get a choice! – Dietrich Epp Aug 08 '14 at 15:53
  • @DietrichEpp you say `Yet another alternative is to store data at a location relative to the code that's being executed.` How can I know via linux command line (bash ubuntu) the start location address of the program being executed? Also how can I know the end location of the program being executed (I suppose start address + program byte dimensions)? – Jacquelyn.Marquardt Nov 29 '16 at 20:52
  • @f126ck: This is kind of its own question. First of all, Bash is just a shell—it does a little scripting and it runs programs, but it's not really relevant here. A program isn't a flat chunk of data that gets loaded into memory. An ELF executable on Linux contains segments, which are contiguous chunks of memory with certain permissions and contents. You can see these by running `readelf --headers` on your program. The program code is typically stored in a section (not segment) named `.text`, and you can read the address from there. – Dietrich Epp Nov 29 '16 at 23:05
  • … However, the `.text` section is not the only section in the program. Furthermore, programs can be made position-independent, in which case the actual address at runtime will be different, possibly randomized by ASLR. Also note that these addresses are only valid within the program's address space. – Dietrich Epp Nov 29 '16 at 23:07
2

The pointer is just a variable. The only difference between this and, e.g. a long variable is that we know that what is stored in a pointer variable is a memory address instead of an integer.

Therefore, you can find the address of a pointer variable the same way you find the address of any other variable. If you store this address in some other variable, that variable will also have an address, of course.

Your confusion seems to originate from the fact that the pointer (i.e. a variable's address) can in turn be stored. But it does not have to be stored anywhere (you only store it when you need this address for some reason). From the point of view of your program, any variable is more or less a named memory location. So the "pointer to the variable" is a named memory location containing a value that is supposed to "point" to another memory location, hence the name "pointer".
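
A small sketch of this in C (the function name `chain_demo` is invented for the example): a pointer to a pointer exists only because the code asks for one, so the chain stops wherever the program stops taking addresses.

```c
/* Hypothetical example: follows a two-level pointer chain back to the int. */
int chain_demo(void)
{
    int value = 42;
    int *p = &value;    /* p is an ordinary variable whose value is an address */
    int **pp = &p;      /* &p works like & on any other variable */
    /* the address of pp is not stored anywhere, because nothing here asks for it */
    return **pp;        /* pp -> p -> value */
}
```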

Ashalynd
2

A variable that is a pointer is still a variable, and acts like any other variable. The compiler knows where the variable is located and how to access its value. It is just that the value happens to be a memory address, that's all.

Remy Lebeau
1

Let's say the value of a pointer (the address of a byte in memory) is stored somewhere in memory

The address of a byte that you allocated, say like this

char ch = 'a';

is referenced by the compiler in the symbol table with the right offset. At run time, the instructions generated by the compiler will use this offset to move it from primary memory to a register for some operation on it.

A pointer, in the sense you're asking, is not stored anywhere; it is just a value of pointer type that you get when you refer to a variable's address, unless you explicitly create a pointer variable to store it, like this:

&ch;             // address of ch not stored anywhere
char *p = &ch;   // now the address of ch is stored in p

Thus there's no recursion concept here.

legends2k
0

From the compiler's perspective, whether you declare a pointer or an ordinary variable, it is just a memory space. When you declare a variable, a certain block of memory is allocated to it.

The variable can be either an ordinary variable or a pointer. So ultimately we have variables (pointers are just variables too), and they each have a memory location.

Ginu Jacob