How do you explain C++ pointers to a C#/Java developer?

Question

I am a C#/Java developer trying to learn C++. As I try to learn the concept of pointers, I am struck with the thought that I must have dealt with this concept before. How can pointers be explained using only concepts that are familiar to a .NET or Java developer? Have I really never dealt with this, is it just hidden to me, or do I use it all the time without calling it that?

You're familiar with the difference between reference and value types? — Justin Morgan - On strike, Mar 02 '11 at 23:15
You can also use pointers in C#. They work pretty much the same as in C++. — Carra, Mar 02 '11 at 23:17
@Justin Yes, of course. It never occurred to me that was related to pointers. — Michael Hedgpeth, Mar 02 '11 at 23:17
Part of it is just that I'm having a hard time understanding what problem pointers solve. They seem "entirely new" to me, but they're probably just a different way to solve a problem I solve everyday. — Michael Hedgpeth, Mar 02 '11 at 23:19
@Michael: They're closely related, but I haven't used C++ in years so I'm sure I'll get the details wrong if I try to explain. I'm going to defer to the C++ wizards around here to do a better job. — Justin Morgan - On strike, Mar 02 '11 at 23:25
Do you want to know what pointers are in general ? And their specific uses. — Mahesh, Mar 02 '11 at 23:27
@Mahesh yes, from a C#/Java perspective (this is what makes the question unique) — Michael Hedgpeth, Mar 02 '11 at 23:31
http://www.youtube.com/watch?v=i49_SNt4yfk Binky is a pointer genius. — TomF, Aug 26 '13 at 13:16

score 19 · Accepted Answer · edited Aug 26 '13 at 12:26

Java objects in C++

A Java object reference is roughly the equivalent of a C++ shared pointer.

A C++ pointer is like a Java object reference without the garbage collection built in.

C++ objects.

C++ has three ways of allocating objects:

Static Storage Duration objects.
- These are created at startup (before main) and die after main exits.
  There are some technical caveats to that but that is the basics.
Automatic Storage Duration objects.
- These are created when declared and destroyed when they go out of scope.
  I believe these are like C# structs
Dynamic Storage Duration objects.
- These are created via new and are the closest thing to a C#/Java object (they are accessed through pointers).
  Technically, objects created via new need to be destroyed manually via delete. But doing this by hand is considered bad practice, and under normal circumstances the pointers are put inside Automatic Storage Duration objects (usually called smart pointers) that control their lifespan. When the smart pointer goes out of scope it is destroyed, and its destructor can call delete on the pointer. Smart pointers can be thought of as fine-grained garbage collectors.
  
  The closest to Java is the shared_ptr: a smart pointer that keeps a count of the number of users of the pointer and deletes the object when nobody is using it.

+1 Though I would add that the shared_ptr has significant differences to garbage collection in C#/Java - for instance circular references are an issue in C++ shared_ptr and not as much in C# (of course there are exceptions to this, but that's probably beyond the scope of this question). — Jeremy Bell, Mar 02 '11 at 23:42

score 7 · Answer 2 · edited Aug 26 '13 at 13:30

You are "using pointers" all the time in C#; it's just hidden from you.

The best way I reckon to approach the problem is to think about the way a computer works. Forget all of the fancy stuff of .NET: you have the memory, which just holds byte values, and the processor, which just does things to these byte values.

The value of a given variable is stored in memory, so is associated with a memory address. Rather than having to use the memory address all the time, the compiler lets you read from it and write to it using a name.

Furthermore, you can choose to interpret a value as a memory address at which you wish to find another value. This is a pointer.

For example, let's say our memory contains the following values:

Address [0] [1] [2] [3] [4] [5] [6] [7]
Data    5   3   1   8   2   7   9   4

Let's define a variable, x, which the compiler has chosen to put at address 2. It can be seen that the value of x is 1.

Let's now define a pointer, p, which the compiler has chosen to put at address 7. The value of p is 4. The value pointed to by p is the value at address 4, which is the value 2. Getting at the value is called dereferencing.

An important concept to note is that there is no such thing as a type as far as memory is concerned: there are just byte values. You can choose to interpret these byte values however you like. For example, dereferencing a char pointer will just get 1 byte representing an ASCII code, but dereferencing an int pointer may get 4 bytes making up a 32 bit value.

Looking at another example, you can create a string in C with the following code:

char *str = "hello, world!";

What that says is the following:

Put aside some bytes in our stack frame for a variable, which we'll call str.
This variable will hold a memory address, which we wish to interpret as the address of a character.
Copy the address of the first character of the string into the variable.
(The string "hello, world!" will be stored in the executable file and hence will be loaded into memory when the program loads.)

If you were to look at the value of str you'd get an integer value which represents an address of the first character of the string. However, if we dereference the pointer (that is, look at what it's pointing to) we'll get the letter 'h'.

If you increment the pointer, str++;, it will now point to the next character. Note that pointer arithmetic is scaled. That means that when you do arithmetic on a pointer, the effect is multiplied by the size of the type it thinks it's pointing at. So assuming int is 4 bytes wide on your system, the following code will actually add 4 to the pointer:

int *ptr = get_me_an_int_ptr();
ptr++;

If you end up going past the end of the string, there's no telling what you'll be pointing at; but your program will still dutifully attempt to interpret it as a character, even if the value was actually supposed to represent an integer, for example. You may well be trying to access memory which is not allocated to your program, in which case the operating system will typically kill your program.

A final useful tip: array indexing and pointer arithmetic are the same thing; indexing is just syntactic sugar. If you have a variable, char *array, then

array[5]

is completely equivalent to

*(array + 5)

score 4 · Answer 3 · edited May 23 '17 at 12:11

A pointer is the address of an object.

Well, technically a pointer value is the address of an object. A pointer object is an object (variable, call it what you prefer) capable of storing a pointer value, just as an int object is an object capable of storing an integer value.

["Object" in C++ includes instances of class types, and also of built-in types (and arrays, etc). An int variable is an object in C++, if you don't like that then tough luck, because you have to live with it ;-)]

Pointers also have a static type, telling the programmer and the compiler what type of object they hold the address of.

What's an address? ~~It's one of those 0x-things with numbers and letters in it that you might sometimes have seen in a debugger~~. For most architectures we can consider memory (RAM, to over-simplify) as a big sequence of bytes. An object is stored in a region of memory. The address of an object is the index of the first byte occupied by that object. So if you have the address, the hardware can get at whatever's stored in the object.

The consequences of using pointers are in some ways the same as the consequences of using references in Java and C# - you're referring to an object indirectly. So you can copy a pointer value around between function calls without having to copy the whole object. You can change an object via one pointer, and other bits of code with pointers to the same object will see the changes. Sharing immutable objects can save memory compared with lots of different objects all having their own copy of the same data that they all need.

C++ also has something it calls "references", which share these properties to do with indirection but are not the same as references in Java. Nor are they the same as pointers in C++ (that's another question).

"I am struck with the thought that I must have dealt with this concept before"

Not necessarily. Languages may be functionally equivalent, in the sense that they all compute the same functions as a Turing machine can compute, but that doesn't mean that every worthwhile concept in programming is explicitly present in every language.

If you wanted to simulate the C memory model in Java or C#, though, I suppose you'd create a very large array of bytes. Pointers would be indexes in the array. Loading an int from a pointer would involve taking 4 bytes starting at that index, and multiplying them by successive powers of 256 to get the total (as happens when you deserialize an int from a bytestream in Java). If that sounds like a ridiculous thing to do, then it's because you haven't dealt with the concept before, but nevertheless it's what your hardware has been doing all along in response to your Java and C# code[*]. If you didn't notice it, then it's because those languages did a good job of creating other abstractions for you to use instead.

Literally the closest the Java language comes to the "address of an object" is that the default hashCode in java.lang.Object is, according to the docs, "typically implemented by converting the internal address of the object into an integer". But in Java, you can't use an object's hashcode to access the object. You certainly can't add or subtract a small number to a hashcode in order to access memory within or in the vicinity of the original object. You can't make mistakes in which you think that your pointer refers to the object you intend it to, but actually it refers to some completely unrelated memory location whose value you're about to scribble all over. In C++ you can do all those things.

[*] well, not multiplying and adding 4 bytes to get an int, not even shifting and ORing, but "loading" an int from 4 bytes of memory.

+1 though, strictly speaking, the fundamental types (int/long/pointers/etc...) are not under the C++ definition of an "object". An object in C++ is generally defined as an instance of a class or struct. This is another difference between C++ and most managed languages: C++ does not have the "everything is an object" abstraction. — Jeremy Bell, Mar 03 '11 at 14:31
@Jeremy: when the C++ standard refers to "an object" in general, that includes objects of builtin type. First obvious example I found, 1.9/9, does that mean "modifying an object" is *only* a "side effect" for objects of class type? Not all of the things that can be true of objects in general can be true of, say, `int` objects in particular, so confusion might be caused by statements like 1.8/2, "objects can contain other objects". This should be taken to mean "in general an object might have sub-objects", not to mean, "anything which cannot contain other objects is not an object". — Steve Jessop, Mar 03 '11 at 14:57
Some C++ programmers choose to use "object" to mean only instances of classes, but I think it's wrong to use standard terminology in conflict with the standard, unless they're very clear that they're talking about some academic OOP principle and hence using "object" in the OOP sense, not the C++ sense. This is why I say, "tough luck, you have to live with it". — Steve Jessop, Mar 03 '11 at 15:02
Good point. It's ironic that from a strictly academic sense, there is no real distinction other than that of mutability vs immutability, so you could say that the standard agreed with academic principles. From a practical standpoint however, the distinction is significant and the practical programmer needs distinct labels for the two categories. What does one use, in place of "object", to categorize an instance of an class or struct as opposed to an instance of an int or a pointer? — Jeremy Bell, Mar 03 '11 at 15:31
@Jeremy: I'd normally say "object of class type", or "class object" as a shorthand. You can get away with the latter in C++ because classes aren't objects - in Python it'd be ambiguous. It's not the only distinction you need to make, though, another important one is between POD and non-POD objects. That has practical effects for what you can actually do with the object, whereas you can write a class that behaves almost exactly like a builtin type. — Steve Jessop, Mar 03 '11 at 16:00

score 2 · Answer 4 · edited Aug 26 '13 at 13:15

Explain the difference between the stack and the heap and where objects go.

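Value types such as C# structs generally go on the stack when they are declared as local variables. Reference types (class instances) get put on the heap. In C++, where an object lives depends on how it is created rather than on whether it is a struct or a class: a local variable goes on the stack, and an object created with new goes on the heap. A pointer (or reference) points to the memory location on the heap for that specific instance.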

Reference type is the key word. Using a pointer in C++ is like using ref keyword in C#.

Managed apps make working with this stuff easy so .NET devs are spared the hassle and confusion. Glad I don't do C anymore.

edited Aug 26 '13 at 13:15 · bluish
answered Mar 02 '11 at 23:18 · Dustin Davis

Not quite. Using the ref keyword in C# is like passing a parameter by reference in C++. – Jeremy Bell Mar 02 '11 at 23:21
yes, a reference. A pointer is a reference to object in memory. ref keyword passes the reference in memory for the object being passed, not the object itself. – Dustin Davis Mar 02 '11 at 23:24
That's true, but the distinction between pass-by-reference and pass-by-pointer is significant in C++. Passing a pointer by reference in C++ (equivalent to passing a class parameter using the ref keyword), is like passing a pointer to a pointer vs just passing a pointer to an object. The distinction is also important in how you can use the parameter within the function - parameters passed by reference can be used as if they were passed by value, whereas parameters passed by pointer must be dereferenced. – Jeremy Bell Mar 02 '11 at 23:33
@DustinDavis: there's a sizeable difference between a reference and a pointer in C++. A pointer is a variable that is interpreted as a memory address. A reference is an alias to another variable. – suszterpatt Mar 02 '11 at 23:35
Well, from an implementation standpoint, passing a parameter by reference in C++ is equivalent to passing a pointer to that parameter - the distinction is in syntax. It's a convenience for the programmer because you don't need to dereference the pointer in the function. In fact, the same is true for ref/out parameters in C#. The difference is that in C++ you can have a pointer to a pointer to a pointer, etc..., whereas in C# you can only have a "pointer" (reference to an object) or a "pointer to a pointer" (function parameter using the ref keyword, if the parameter is a class). – Jeremy Bell Mar 02 '11 at 23:51

score 2 · Answer 5 · answered Mar 02 '11 at 23:30

References in C# act the same way as pointers in C++, without all the messy syntax.

Consider the following C# code:

public class A
{
    public int x;
}

public void AnotherFunc(A a)
{
    a.x = 2;
}

public void SomeFunc()
{
    A a = new A();
    a.x = 1;

    AnotherFunc(a);
    // a.x is now 2
}

Since classes are reference types, we know that we are passing an existing instance of A to AnotherFunc (unlike value types, which are copied).

In C++, we use pointers to make this explicit:

class A
{
public:
    int x;
};

void AnotherFunc(A* a) // notice we are pointing to an existing instance of A
{
    a->x = 2;
}

void SomeFunc()
{
    A a;
    a.x = 1;

    AnotherFunc(&a);
    // a.x is now 2
}

answered Mar 02 '11 at 23:30 · Marlon

Note that in this example, passing a _reference_ would have been a better choice. Yes, C++ has references too. – fouronnes Mar 02 '11 at 23:34
"References in C# act the same way as pointers in C++, without all the messy syntax." Well, except for... *pointer arithmetic*! – R. Martinho Fernandes Mar 02 '11 at 23:35
I chose to use pointers instead of references to show the syntax difference between the languages. But you're right, references probably would have been more clear. – Marlon Mar 02 '11 at 23:38

Keith · Answer 6 · 2011-03-02T23:45:28.233

"How can pointers be explained using only concepts that are familiar to a .NET or Java developer? " I'd suggest that there are really two distinct things that need to be learnt.

The first is how to use pointers, and heap allocated memory, to solve specific problems. With an appropriate style, using shared_ptr<> for example, this can be done in a manner analogous to that of Java. A shared_ptr<> has a lot in common with a Java object handle.

Secondly, however, I would suggest that pointers in general are a fundamentally lower level concept that Java, and to a lesser extent C#, deliberately hides. To program in C++ without moving to that level will guarantee a host of problems. You need to think in terms of the underlying memory layout and think of pointers as literally pointers to specific pieces of storage.

To attempt to understand this lower level in terms of higher concepts would be an odd path to take.

edited Mar 02 '11 at 23:45

answered Mar 02 '11 at 23:40

+1 for your second point. Trying to understand pointers from a high level point of view will make them look ... pointless. – fouronnes Mar 02 '11 at 23:44
lol, this just reminds me of how my profs were always trying to convince me to think of value types as if they were just immutable reference types and that the differences were only in how the virtual machine optimized out the heap allocations for value types. Thus the reasoning behind many languages with no real concept of a value type. I'm stubborn though - I like my C# structs! :) – Jeremy Bell Mar 03 '11 at 00:06

score 2 · Answer 7 · answered Mar 03 '11 at 05:21

Get two sheets of large format graph paper, some scissors and a friend to help you.

Each square on the sheets of paper represents one byte.

One sheet is the stack.

The other sheet is the heap. Give the heap to your friend - he is the memory manager.

You are going to pretend to be a C program and you'll need some memory. When running your program, cut out chunks from the stack and the heap to represent memory allocation.

Ready?

#include <stdlib.h> /* malloc, free */

int main(void) {
    int  a;                       /* Take four bytes from the stack. */
    int *b = malloc(sizeof(int)); /* Take four bytes from the heap. */

    a = 1;  /* Write on your first little bit of graph paper, WRITE IT! */
    *b = 2; /* Get writing (on the other bit of paper) */

    b = malloc(sizeof(int)); /* Take another four bytes from the heap. 
                                Throw the first 'b' away. Do NOT give it 
                                back to your friend (this is a memory leak) */

    free(b); /* Give the four bytes back to your friend */
    *b = 3;  /* Use after free: your friend must now kill you and bury the body */
} /* Give back the four bytes that were 'a' */

Try with some more complex programs.

score 0 · Answer 8 · edited Aug 26 '13 at 13:38

In C#, all references to class instances are roughly equivalent to pointers in the C++ world. For value types (structs, ints, etc.) this is not the case.

C#:

void func1(string parameter)
void func2(int parameter)

C++:

void func1(string* parameter)
void func2(int parameter)

Passing a parameter using the ref keyword in C# is equivalent to passing a parameter by reference in C++.

C#:

void func1(ref string parameter)
void func2(ref int parameter)

C++:

void func1(string*& parameter)
void func2(int& parameter)

If the parameter is a class, it would be like passing a pointer by reference.

score 0 · Answer 9 · answered Mar 02 '11 at 23:30

The key for me was to understand the way memory works. Variables are stored in memory. The places in which you can put variables in memory are numbered. A pointer is a variable that holds this number.

answered Mar 02 '11 at 23:30 · fouronnes

score 0 · Answer 10 · answered Mar 02 '11 at 23:30

Any C# programmer that understands the semantic differences between classes and structs should be able to understand pointers. I.e., explaining in terms of value vs. reference semantics (in .NET terms) should get the point across; I wouldn't complicate things by trying to explain in terms of ref (or out).
