C pointers and references

Question

I would like to know what's really happening calling & and * in C.
Do they cost a lot of resources? Should I use & each time I want to get the address of the same variable, or keep it in memory, i.e. in a cache variable? The same goes for *, i.e. when I want to get the value a pointer points to.

Example

void        bar(char *str)
{
    check_one(*str);
    check_two(*str);

    //... Could be replaced by

    char    c = *str;

    check_one(c);
    check_two(c);
}

Have you looked at the generated assembly? A compiler will most likely optimize the first form into the second. — StoryTeller - Unslander Monica, Nov 07 '16 at 16:50
Possible duplicate of [When to use References vs. Pointers](http://stackoverflow.com/questions/7058339/when-to-use-references-vs-pointers) — Eli Sadoff, Nov 07 '16 at 16:50
It costs nearly nothing, so stop worrying about it and concentrate on the logic. — Eugene Sh., Nov 07 '16 at 16:51
@StoryTeller it can't: `check_one` might end up modifying the char pointed to by `str`. — Quentin, Nov 07 '16 at 16:52
@StoryTeller suppose that `str` points at some global `char` variable, that is modified by `check_one`. Devious, but possible. — Quentin, Nov 07 '16 at 16:53
@EliSadoff yes, but It is not duplicate with that question right. I replied to Story's comment above that C doesn't have reference type — Danh, Nov 07 '16 at 16:53
@Quentin if the function signature is `check_one(char c)` (note there is no star there), I fail to see how that's possible. — StoryTeller - Unslander Monica, Nov 07 '16 at 16:54
I tagged this as a duplicate because it seemed that despite being tagged as C it was about C++ mostly because references don't exist within C. — Eli Sadoff, Nov 07 '16 at 16:55
@Romain-p are you using C or C++? This question is relatively confusing if it's about C because references don't exist in C. — Eli Sadoff, Nov 07 '16 at 16:56
@StoryTeller `char g = 'a'; void check_one(char c) { g = 'b'; } /* ... */ bar(&g);` -- in this case, `check_two` will receive `'b'`, not `'a'`. — Quentin, Nov 07 '16 at 16:56
yea but this sign '&' means reference for me i.e the address — Romain-p, Nov 07 '16 at 16:56
@EliSadoff, just because C++ uses "reference" as a technical term, doesn't mean the rest of the English speaking world is banned from using it as colloquial for pass-by-pointer. Since this question isn't tagged C++, there is no confusion — StoryTeller - Unslander Monica, Nov 07 '16 at 16:57
@Quentin, got ya. And I am again reminded of the evil that is globals — StoryTeller - Unslander Monica, Nov 07 '16 at 16:59
Alright, sorry! I meant the address, then. Still, we call *var a dereference — Romain-p, Nov 07 '16 at 17:00
@EliSadoff Well, it is possible to de-reference in C, there's an operator for that... It'd be rather strange to de-reference a non-reference, now wouldn't it. Therefore it follows, that C must be able to have references ;-) — hyde, Nov 07 '16 at 17:03
@hyde I think the official name is "Indirection". No "de-reference" operator in the C standard. — Eugene Sh., Nov 07 '16 at 17:04
@EugeneSh. Hmm, for `*` yeah, but if wikipedia is correct, `->` is called struct *dereference* operator (didn't check the standard text). — hyde, Nov 07 '16 at 18:44

score 4 · Accepted Answer · edited Nov 07 '16 at 17:10

I would like to know what's really happening calling & and * in C.

There's no such thing as "calling" & or *. They are operators, not functions: & is the address operator and * is the dereference (indirection) operator. They instruct the compiler to work with the address of an object, or with the object that a pointer points to, respectively.

And C is not C++, so there are no references; I think you just misused that word in your question's title.

In most cases, an object and its address are basically two ways of looking at the same thing.

Usually, you'll use & when you actually want the address of an object. Since the compiler needs to handle objects in memory by their address anyway, there's no overhead.
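As a minimal sketch of that point (the function name `read_via_cached_address` is hypothetical and not from the question), taking the address of the same object twice yields the same pointer, so keeping it in a variable is a matter of readability rather than speed:

```c
#include <assert.h>

/* Hypothetical helper, not part of the original question. */
int read_via_cached_address(void)
{
    int value = 42;
    int *cached = &value;       /* address stored once in a pointer variable */

    assert(cached == &value);   /* taking the address again gives the same pointer */
    assert(*cached == value);   /* reading through it gives the object itself */
    return *cached;
}
```

Whether `cached` ever occupies memory of its own is up to the compiler; with optimization enabled it will often live in a register or disappear entirely.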

For the specific implications of using the operators, you'll have to look at the assembly your compiler generates.

Example: consider this trivial code, disassembled via godbolt.org:

#include <stdio.h>
#include <stdlib.h>

void check_one(char c)
{
    if(c == 'x')
        exit(0);
}

void check_two(char c)
{
    if(c == 'X')
        exit(1);
}

void foo(char *str)
{
    check_one(*str);
    check_two(*str);
}

void bar(char *str)
{
    char c = *str;
    check_one(c);
    check_two(c);
}

int main()
{
    char msg[] = "something";
    foo(msg);
    bar(msg);
}

The compiler output can vary wildly depending on the vendor and optimization settings.

clang 3.8 using -O2

check_one(char):                          # @check_one(char)
        movzx   eax, dil
        cmp     eax, 120
        je      .LBB0_2
        ret
.LBB0_2:
        push    rax
        xor     edi, edi
        call    exit

check_two(char):                          # @check_two(char)
        movzx   eax, dil
        cmp     eax, 88
        je      .LBB1_2
        ret
.LBB1_2:
        push    rax
        mov     edi, 1
        call    exit

foo(char*):                               # @foo(char*)
        push    rax
        movzx   eax, byte ptr [rdi]
        cmp     eax, 88
        je      .LBB2_3
        movzx   eax, al
        cmp     eax, 120
        je      .LBB2_2
        pop     rax
        ret
.LBB2_3:
        mov     edi, 1
        call    exit
.LBB2_2:
        xor     edi, edi
        call    exit

bar(char*):                               # @bar(char*)
        push    rax
        movzx   eax, byte ptr [rdi]
        cmp     eax, 88
        je      .LBB3_3
        movzx   eax, al
        cmp     eax, 120
        je      .LBB3_2
        pop     rax
        ret
.LBB3_3:
        mov     edi, 1
        call    exit
.LBB3_2:
        xor     edi, edi
        call    exit

main:                                   # @main
        xor     eax, eax
        ret

Notice that foo and bar are identical. Do other compilers do something similar? Well...

gcc x64 5.4 using -O2

check_one(char):
        cmp     dil, 120
        je      .L6
        rep ret
.L6:
        push    rax
        xor     edi, edi
        call    exit
check_two(char):
        cmp     dil, 88
        je      .L11
        rep ret
.L11:
        push    rax
        mov     edi, 1
        call    exit
bar(char*):
        sub     rsp, 8
        movzx   eax, BYTE PTR [rdi]
        cmp     al, 120
        je      .L16
        cmp     al, 88
        je      .L17
        add     rsp, 8
        ret
.L16:
        xor     edi, edi
        call    exit
.L17:
        mov     edi, 1
        call    exit
foo(char*):
        jmp     bar(char*)
main:
        sub     rsp, 24
        movabs  rax, 7956005065853857651
        mov     QWORD PTR [rsp], rax
        mov     rdi, rsp
        mov     eax, 103
        mov     WORD PTR [rsp+8], ax
        call    bar(char*)
        mov     rdi, rsp
        call    bar(char*)
        xor     eax, eax
        add     rsp, 24
        ret

Well, if there were any doubt that foo and bar are equivalent, at least as far as the compiler is concerned, I think this:

foo(char*):
        jmp     bar(char*)

is a strong argument they indeed are.
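That equivalence depends on the compiler being able to see that check_one does not change what str points to. The following is a sketch of the counterexample raised in the comments under the question, assuming a hypothetical global `g` that check_one overwrites; the `seen_by_two` variable and the `run_foo`/`run_bar` wrappers are added here only to make the difference observable:

```c
#include <assert.h>

static char g = 'a';         /* hypothetical global that str may point to */
static char seen_by_two;     /* records the argument check_two received */

void check_one(char c) { (void)c; g = 'b'; }   /* side effect: modifies g */
void check_two(char c) { seen_by_two = c; }

char foo(char *str)          /* reads *str twice */
{
    check_one(*str);
    check_two(*str);         /* re-reads the object after check_one ran */
    return seen_by_two;
}

char bar(char *str)          /* reads *str once and keeps a copy */
{
    char c = *str;
    check_one(c);
    check_two(c);            /* uses the copy taken before check_one ran */
    return seen_by_two;
}

char run_foo(void) { g = 'a'; return foo(&g); }
char run_bar(void) { g = 'a'; return bar(&g); }
```

Here run_foo() yields 'b' while run_bar() yields 'a', so in this situation the two forms are not interchangeable and the compiler must not merge them.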

Okay but imagine, I have a var: `char c = 'a'`. What's going on if I do: `&&c` — Romain-p, Nov 07 '16 at 17:03
`&c` gives me the address of `c`, and `&&c` should create a new memory cell containing the address of the address of the char — Romain-p, Nov 07 '16 at 17:04
@Romain-p that's not true. `&c` is just a temporary value. Whether or not it gets its own memory depends on what you do with it. You can't `&&c`, since `&c` itself doesn't have an address. Did you try compiling that code? — Marcus Müller, Nov 07 '16 at 17:05
@Romain-p That is only valid in C++ and that results in a rvalue-reference in c++11 onwards or error in pre c++11 — smac89, Nov 07 '16 at 17:08
Of course, that's why I wrote "should": & should give me the address of a given variable, and it doesn't work in some cases — Romain-p, Nov 07 '16 at 17:08
@Romain-p what you think "should" be the case for C doesn't make any difference, to be honest :) In my world, `&(&c)` *should* not work (and see, it doesn't work) in C because `&c` is an rvalue, and thus simply doesn't have an address, simple as that. — Marcus Müller, Nov 07 '16 at 17:10
@John Bode: Given `char **var`, `var[0] (=) *var` gives me an address (pointer) to the memory cell containing my char. `**var (=) var[0][0]` should also give me a char. So I thought it works in the other direction too, i.e. given `char c`, `&c` should give me the address of my char c, and `&(&c)` could create a new pointer — Romain-p, Nov 07 '16 at 17:14
@Romain-p: You can obtain the address of an *object* that contains a pointer value, but you cannot obtain the address *of an address*. `&(&c)` can't work in C, not only because the operation is nonsensical, but because the operand of unary `&` must be an *lvalue* (an expression that refers to an object such that the object may be read and/or modified); however, the *result* of using the unary `&` operator is *not* an lvalue - it doesn't refer to an object, it's just a value (just like you can't assign a new value to the result of `1+2`). — John Bode, Nov 07 '16 at 18:16

John Bode · Answer 2 · 2016-11-07T17:00:33.947

2

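In C, there's essentially no runtime cost associated with the unary & or * operators themselves. Taking an address with & is resolved at compile or link time, or is at most a cheap address computation; * compiles to an ordinary memory access, which the optimizer can often keep in a register. So there's usually no measurable difference in runtime between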

check_one(*str)
check_two(*str)

and

 char c = *str;
 check_one( c );
 check_two( c );

ignoring the overhead of the assignment.

That's not necessarily true in C++, since you can overload those operators.

edited Nov 07 '16 at 17:00

answered Nov 07 '16 at 16:54

John Bode

smac89 · Answer 3 · 2016-11-07T17:11:21.760

0

tldr;

If you are programming in C, then the & operator is used to obtain the address of a variable and * is used to get the value of that variable, given its address.

This is also why, in C, a char * parameter on its own does not say whether the function expects a pointer to a single char or a NUL-terminated string. Unless you document that (or pass a length alongside the pointer), someone unfamiliar with your logic who sees only the function signature cannot tell whether the function is meant to be called as bar(&some_char) or bar(some_cstr).

To conclude, if you have a variable x of type someType, then &x yields a someType * (call it addressOfX), and *addressOfX yields the value of x. C has no reference parameters: a function that needs access to the caller's object takes a pointer, i.e. you cannot declare a parameter whose type is someType& or someType&&.
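A short sketch of how two levels of indirection are built in practice, since `&&c` does not compile in C. The function name `read_through_two_levels` is made up for illustration; the idea is that each `&` needs a named object to apply to:

```c
#include <assert.h>

/* Hypothetical helper, not part of the original question. */
char read_through_two_levels(void)
{
    char c = 'a';
    char *p = &c;      /* p is an object, so it has an address of its own */
    char **pp = &p;    /* address of the pointer object p, not "&&c" */

    assert(*pp == p);  /* one level of indirection gives back the pointer */
    assert(**pp == c); /* two levels give back the char */
    return **pp;
}
```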

Also your examples can be rewritten as:

check_one(str[0]);
check_two(str[0]);
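The rewrite works because C defines `str[0]` as `*(str + 0)`, so both spellings name the same object. A minimal sketch, with `first_char` as a hypothetical helper name:

```c
#include <assert.h>

/* Hypothetical helper, not part of the original question. */
char first_char(const char *str)
{
    assert(&str[0] == str);   /* same address: str[0] is *(str + 0) */
    assert(str[0] == *str);   /* therefore the same value */
    return str[0];
}
```

Choosing between `*str` and `str[0]` is a readability decision; the subscript form tends to signal that str points into an array.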

edited Nov 07 '16 at 17:11

answered Nov 07 '16 at 17:03

smac89

Given `char **var`, `var[0] (=) *var` gives me an address (pointer) to the memory cell containing my char. `**var (=) var[0][0]` should also give me a char. So I thought it works in the other direction too, i.e. given `char c`, `&c` should give me the address of my char c, and &(&c) could create a new pointer pointing to my first pointer, which points to my char – Romain-p Nov 07 '16 at 17:26
@Romain-p it still works that way, the only problem is that you have to split it into multiple statements for it to work. So `char *p = &c`, then `char **pp = &p`. C limits you to only using one `&` operator because `&c` produces an rvalue, that is, a value which does not have an address, so doing `&&c` is like trying to take the address of something which does not have an address...whaaa?? Stop it – smac89 Nov 07 '16 at 17:32

score 0 · Answer 4 · answered Nov 07 '16 at 17:17

AFAIK, on x86 and x64 (without optimization) your variables are stored in memory (unless declared with the register keyword) and accessed through their addresses. const int foo = 5 is equivalent to foo dd 5, and check_one(foo) is equivalent to push dword [foo]; call check_one.

If you create an additional variable c, then it looks like:

c resd 1
...
mov eax, [foo]
mov dword [c], eax ; Variable foo just copied to c
push dword [c]
call check_one

Nothing changes except the additional copy and memory allocation. I think the compiler's optimizer deals with that and makes both cases as fast as possible, so you can use the more readable variant.
