Long-Winded Explanation of Pointers
When explaining what pointers are to people who already know how to program, I find that it's really easy to introduce them using array terminology.
Below all abstraction, your computer's memory is really just a big array, which we will call mem. mem[0] is the first byte in memory, mem[1] is the second, and so forth.
When your program is running, almost all variables are stored in memory somewhere. The way variables are seen in code is pretty simple. Your CPU knows a number which is an index in mem (which I'll call base) where your program's data is, and the actual code just refers to variables using base and an offset.
For a hypothetical bit of code, let's look at this:
byte foo(byte a, byte b){
    byte c = a + b;
    return c;
}
A naive but good example of what this actually ends up looking like after compiling is something along the lines of:
- Move base to make room for three new bytes
- Set mem[base+0] (variable a) to the value of a
- Set mem[base+1] (variable b) to the value of b
- Set mem[base+2] (variable c) to the sum mem[base+0] + mem[base+1]
- Set the return value to mem[base+2]
- Move base back to where it was before calling the function
The exact details of what happens are platform- and convention-specific, but it will generally look like that without any optimizations.
As the example illustrates, the notion of a, b, and c being special entities kind of goes out the window. The compiler calculates what offset to give the variables when generating relevant code, but the end result just deals with base and hard-coded offsets.
What is a pointer?
A pointer is just a fancy way to refer to an index within the mem array. In fact, a pointer is really just a number. That's all it is; C just gives you some syntax to make it a little more obvious that it's supposed to be an index in the mem array rather than some arbitrary number.
What do referencing and dereferencing mean?
When you reference a variable (like &var) the compiler retrieves the offset it calculated for the variable, and then emits some code that roughly means "Return the sum of base and the variable's offset".
Here's another bit of code:
void foo(byte a){
    byte bar = a;
    byte *ptr = &bar;
}
(Yes, it doesn't do anything, but it's for illustration of basic concepts)
This roughly translates to:
- Move base to make room for two bytes and a pointer
- Set mem[base+0] (variable a) to the value of a
- Set mem[base+1] (variable bar) to the value of mem[base+0]
- Set mem[base+2] (variable ptr) to the value of base+1 (since 1 was the offset used for bar)
- Move base back to where it had been earlier
In this example you can see that when you reference a variable, the compiler just uses the memory index as the value, rather than the value found in mem at that index.
Now, when you dereference a pointer (like *ptr) the compiler uses the value stored in the pointer as the index in mem. Example:
void foo(byte* a){
    byte value = *a;
}
Explanation:
- Move base to make room for a pointer and a byte
- Set mem[base+0] (variable a) to the value of a
- Set mem[base+1] (variable value) to mem[mem[base+0]]
- Move base back to where it started
In this example, the compiler uses the value in memory where the index of that value is specified by another value in memory. This can go as deep as you want, but usually only ever goes one or two levels deep.
A few notes
Since a referenced variable is really just a number, you can't reference a reference or assign a value to one: base+offset, the value we get from the first reference, is computed rather than stored in memory, so there is no location for us to take or assign to. (&var = value; and &&var are illegal statements.) However, you can dereference a reference, but that just puts you back where you started (*&var is legal).
On the flip side, since a dereferenced variable is a value in memory, you can reference a dereferenced value, dereference a dereferenced value, and assign data to a dereferenced variable. (*var = value;, &*var, and **var are all legal statements, provided var has a suitable pointer type.)
Also, not all types are one byte large, but I simplified the examples to make them a bit easier to grasp. In reality, a pointer would occupy several bytes in memory on most machines, but I kept it at one byte to avoid confusing the issue. The general principle is the same.
Summed up
- Memory is just a big array I'm calling mem.
- Each variable is stored in memory at a location I'm calling varlocation, which is specified by the compiler for every variable.
- When the computer refers to a variable normally, it ends up looking like mem[varlocation] in the end code.
- When you reference the variable, you just get the numerical value of varlocation in the end code.
- When you dereference the variable, you get the value of mem[mem[varlocation]] in the code.
tl;dr - To actually answer the question...
//Your variables x and y and ptr
int x, y;
int *ptr;
//Store the location of x (x_location) in the ptr variable
ptr = &x; //Roughly: mem[ptr_location] = x_location;
//Initialize your x value with scanf
//Notice scanf takes the location of (a.k.a. pointer to) x to know where
//to put the value in memory
scanf("%d", &x);
y = *ptr; //Roughly: mem[y_location] = mem[mem[ptr_location]]
//Since 'mem[ptr_location]' was set to the value 'x_location',
//then that line turns into 'mem[y_location] = mem[x_location]'
//which is the same thing as 'y = x;'
Overall, you just missed the star to dereference the variable, as others have already pointed out.