64

I'm not sure if this is a proper programming question, but it's something that has always bothered me, and I wonder if I'm the only one.

When initially learning C++, I understood the concept of references, but pointers had me confused. Why, you ask? Because of how you declare a pointer.

Consider the following:

void foo(int* bar)
{
}


int main()
{
    int x = 5;
    int* y = NULL;

    y = &x;
    *y = 15;     
    foo(y);

}

The function foo(int*) takes an int pointer as its parameter. Since I've declared y as an int pointer, I can pass y to foo. But when first learning C++ I associated the * symbol with dereferencing, so I figured a dereferenced int needed to be passed. I would try to pass *y into foo, which obviously doesn't work, because *y is an int and not an int*.

Wouldn't it have been easier to have a separate operator for declaring a pointer? (or for dereferencing). For example:

void test(int@ x)
{
}
Archie
diggingforfire
  • 11
    This question can't be answered, only speculated upon. – bmargulies Dec 31 '11 at 00:58
  • What's wrong with genuine curiosity? – diggingforfire Dec 31 '11 at 01:13
  • @diggingforfire : Nothing, but that's not what SO is for. Try [Programmers.SE](http://programmers.stackexchange.com/) for discussion-based questions. – ildjarn Dec 31 '11 at 01:14
  • 20
    @bmargulies It can be answered directly; the creator of C wrote a document explaining exactly why this is so. – Crashworks Dec 31 '11 at 01:15
  • 4
    A question can be in the form of genuine curiosity, right? In fact, I find Crashworks' answer to be just that, a direct answer to my question. So why could this only be speculated upon? – diggingforfire Dec 31 '11 at 01:16
  • @ildjarn : telling people to go and create their own language implies I'm saying "geez c++ sucks who thought of this stupid idea", which is hardly the case here. – diggingforfire Dec 31 '11 at 01:30
  • 3
    @ildjarn: This isn't discussion-based, there's a clear answer to the question and we've all given it. – Stuart Golodetz Dec 31 '11 at 01:35
  • 2
    It might make sense to take out the `phooehy` case and reduce the code to a simplified `main`. I removed `reference` from the tags, as the question does not seem to be about references at all. Although the *related* question might be: "Why is the address-of operator (&) also used to declare a reference?" It is also overloaded in a similar fashion. –  Dec 31 '11 at 01:39
  • @diggingforfire : I said nothing like that, so I don't know what your point is. – ildjarn Dec 31 '11 at 02:14
  • @ildjarn: I was referring to Tomalak Geret'kal's comment, pardon the confusion. – diggingforfire Dec 31 '11 at 02:15
  • @Stuart : The only real answer here is the one quoting the language's creator; your answer in particular is purely subjective and subjective answers are not encouraged on SO. That was my point, and it still stands, in general. – ildjarn Dec 31 '11 at 02:15
  • Crashworks gave a much more helpful answer, which is why I upvoted it. But mine wasn't subjective, it came from hanging around in comp.lang.c++ and having come across the reason before from people who know more than I do :) Anyway, no worries. – Stuart Golodetz Dec 31 '11 at 02:38
  • 1
    Well, it depends...which are cuter, puppies or kittens? – John Fitzpatrick Jan 03 '12 at 19:08
  • 1
    Whichever gets me more rep, my man. – diggingforfire Jan 03 '12 at 19:10

6 Answers

92

In The Development of the C Language, Dennis Ritchie explains his reasoning thusly:

The second innovation that most clearly distinguishes C from its predecessors is this fuller type structure and especially its expression in the syntax of declarations... given an object of any type, it should be possible to describe a new object that gathers several into an array, yields it from a function, or is a pointer to it.... [This] led to a declaration syntax for names mirroring that of the expression syntax in which the names typically appear. Thus,

int i, *pi, **ppi;

declare an integer, a pointer to an integer, a pointer to a pointer to an integer. The syntax of these declarations reflects the observation that i, *pi, and **ppi all yield an int type when used in an expression.

Similarly,

int f(), *f(), (*f)();

declare a function returning an integer, a function returning a pointer to an integer, a pointer to a function returning an integer.

int *api[10], (*pai)[10];

declare an array of pointers to integers, and a pointer to an array of integers.

In all these cases the declaration of a variable resembles its usage in an expression whose type is the one named at the head of the declaration.

An accident of syntax contributed to the perceived complexity of the language. The indirection operator, spelled * in C, is syntactically a unary prefix operator, just as in BCPL and B. This works well in simple expressions, but in more complex cases, parentheses are required to direct the parsing. For example, to distinguish indirection through the value returned by a function from calling a function designated by a pointer, one writes *fp() and (*pf)() respectively. The style used in expressions carries through to declarations, so the names might be declared

int *fp();
int (*pf)();

In more ornate but still realistic cases, things become worse: int *(*pfp)(); is a pointer to a function returning a pointer to an integer.

There are two effects occurring. Most important, C has a relatively rich set of ways of describing types (compared, say, with Pascal). Declarations in languages as expressive as C—Algol 68, for example—describe objects equally hard to understand, simply because the objects themselves are complex. A second effect owes to details of the syntax. Declarations in C must be read in an `inside-out' style that many find difficult to grasp. Sethi [Sethi 81] observed that many of the nested declarations and expressions would become simpler if the indirection operator had been taken as a postfix operator instead of prefix, but by then it was too late to change.
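
The "declaration mirrors use" rule from the quote can be checked directly. The following is a minimal sketch, not taken from Ritchie's paper: the names `storage`, `address_of_storage` and `declaration_mirrors_use` are made up for illustration, and the array has 3 elements instead of 10 to keep it short.

```cpp
#include <cassert>

// Hypothetical helpers: something for the function pointer to point at.
static int storage = 42;
int *address_of_storage() { return &storage; }

int declaration_mirrors_use() {
    // Declared as "i, *pi and **ppi are ints".
    int i = 1, *pi = &i, **ppi = &pi;

    // Used the same way: each expression yields an int.
    int a = i, b = *pi, c = **ppi;

    // Declared as "*(*pfp)() is an int": pointer to function returning int*.
    int *(*pfp)() = address_of_storage;
    int d = *(*pfp)();          // same shape as the declarator

    // Declared as "(*pai)[k] is an int": pointer to an array of 3 ints.
    int arr[3] = {10, 20, 30};
    int (*pai)[3] = &arr;
    int e = (*pai)[2];          // same shape as the declarator

    assert(a == 1 && b == 1 && c == 1);
    return a + b + c + d + e;   // 1 + 1 + 1 + 42 + 30
}
```

In each case the expression on the right-hand side has the same shape as the declarator, which is the property the quoted passage describes.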

ricmarques
Crashworks
  • 7
    That is an excellent explanation, I wish we were actually being told these kind of things in class, it would've helped me understand pointers much sooner. – diggingforfire Dec 31 '11 at 01:52
  • 1
    This is very helpful! I've always wondered about the weird syntax of function pointers. Knowing the background makes reading and writing them much easier. Love this website <3 – pezcode Dec 31 '11 at 02:08
  • 3
    @Tomalak Geret'kal: Shucks, too bad I can't edit it now. Well, these things do happen when writing in a foreign language. – diggingforfire Dec 31 '11 at 02:10
  • 1
    @diggingforfire You may also like my "graph paper and pencil" technique for reasoning with pointers (example: http://stackoverflow.com/questions/7062853/c-pointer-assignment-question/7062888#7062888). I generally believe that learning what the machine actually does with pointers first makes learning the C abstraction easier; rather than trying to learn the abstraction first and then the concrete after. – Crashworks Dec 31 '11 at 02:38
  • I found the [Binky Pointer Fun](http://cslibrary.stanford.edu/104/) videos informative. If I taught a curriculum I would use them. –  Dec 31 '11 at 03:45
  • 2
    What part of this quote explains why prefix * was selected over alternatives? – David Heffernan Dec 31 '11 at 08:29
  • @David: None of it, as far as I can tell. – Lightness Races in Orbit Dec 31 '11 at 14:02
  • 1
    @DavidHeffernan - the "just as in BCPL and B" in BCPL all unary operators were prefix – mmmmmm Dec 31 '11 at 15:06
  • @Mark I thought the italicised sections were meant to be special. I bet that what happened was that BCPL took this route, C followed and then C++ followed. Understanding the reasons for the very original decision would require a time machine. – David Heffernan Dec 31 '11 at 15:54
  • 1
    @David Or you could read the rest of the linked document which describes the BCPL roots. – Crashworks Dec 31 '11 at 22:59
  • @Crashworks: I never once claimed that it was the former. I understand the question just fine, thanks! – Lightness Races in Orbit Jan 04 '12 at 02:30
  • @LightnessRacesinOrbit Tired of working for the Tal Shiar, eh? – Crashworks Jan 05 '12 at 08:59
17

The reason is clearer if you write it like this:

int x, *y;

That is, both x and *y are ints. Thus y is an int *.
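
A practical consequence of the * belonging to the name rather than to the type is that `int* p, q;` declares only one pointer. A small sketch of this, assuming C++11 for `decltype` (the function name `only_first_is_pointer` is made up for illustration):

```cpp
#include <cassert>
#include <type_traits>

bool only_first_is_pointer() {
    // Reads as "*p is an int, and q is an int", so q is NOT a pointer.
    int* p, q = 0;
    p = &q;

    static_assert(std::is_same<decltype(p), int*>::value, "p is an int*");
    static_assert(std::is_same<decltype(q), int>::value, "q is a plain int");

    assert(*p == q);  // dereferencing p gives back the int it points to
    return std::is_pointer<decltype(p)>::value &&
           !std::is_pointer<decltype(q)>::value;
}
```

This is one reason many style guides suggest declaring one variable per line when pointers are involved.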

Stuart Golodetz
  • Actually, this is not clearer if you expand it. I don't want to start an argument, but I'd like to point out that this syntax is bound to lead to one. – Lightness Races in Orbit Dec 31 '11 at 01:02
  • 1
    I agree that you shouldn't *use* this syntax in practice - it was merely to illustrate the point (which is the same point David makes). – Stuart Golodetz Dec 31 '11 at 01:03
  • Indeed, nothing against you or your answer (except perhaps the "clearer" part) – Lightness Races in Orbit Dec 31 '11 at 01:04
  • No offence taken :) Just trying to clarify. – Stuart Golodetz Dec 31 '11 at 01:05
  • This answer illustrates why the Q is closed as not constructive. – David Heffernan Dec 31 '11 at 08:40
  • 3
    Probably not the clearest answer in the world, but typed on an iPad hence shorter than normal. The point I was trying to make is that if you group the * with the variable name then it becomes clearer where this syntax came from - namely the idea that by saying * y is an int, you are implying that y itself is an int *. As this answer has been borne out in practice by the reference given (and it wasn't a guess), I'm not sure where the hostility is coming from. Anyway, let's agree to disagree. – Stuart Golodetz Dec 31 '11 at 12:41
  • I think hostility is perhaps overstating it. But you are drawing a bit of flak because you were vocal in criticising the close votes but then produced an answer that, to my mind and it would seem to the minds of others, perfectly captures why this question is not constructive: *This question is not a good fit to our Q&A format. We expect answers to generally involve facts, references, or specific expertise; this question will likely solicit opinion, debate, arguments, polling, or extended discussion.* But yeah, let's agree to disagree! ;-) – David Heffernan Dec 31 '11 at 14:11
12

That is a language decision that predates C++, as C++ inherited it from C. I once heard that the motivation was that the declaration and the use would be equivalent, that is, given a declaration int *p; the expression *p is of type int in the same way that with int i; the expression i is of type int.

David Rodríguez - dribeas
10

Because the committee, and those that developed C++ in the decades before its standardisation, decided that * should retain its original three meanings:

  • A pointer type
  • The dereference operator
  • Multiplication

You're right to suggest that the multiple meanings of * (and, similarly, &) are confusing. I've been of the opinion for some years that they are a significant barrier to understanding for language newcomers.


Why not choose another symbol for C++?

Backwards-compatibility is the root cause... best to re-use existing symbols in a new context than to break C programs by translating previously-not-operators into new meanings.


Why not choose another symbol for C?

It's impossible to know for sure, but there are several arguments that can be — and have been — made. Foremost is the idea that:

when [an] identifier appears in an expression of the same form as the declarator, it yields an object of the specified type. {K&R, p216}

This is also why C programmers tend to[citation needed] prefer aligning their asterisks to the right rather than to the left, i.e.:

int *ptr1; // roughly C-style
int* ptr2; // roughly C++-style

though both varieties are found in programs of both languages, varyingly.
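
For reference, the three meanings of * listed at the top of this answer can all appear within a few lines. This is only an illustrative sketch (the function name `three_meanings` is invented here):

```cpp
#include <cassert>

int three_meanings() {
    int x = 5;
    int* p = &x;           // in a declaration: p has a pointer type
    *p = 15;               // unary, in an expression: dereference (writes to x)
    int product = x * 2;   // binary, in an expression: multiplication
    int mixed = *p * *p;   // both operators at once: (*p) * (*p)

    assert(x == 15);
    return product + mixed;  // 30 + 225
}
```

The compiler tells the cases apart by position: after a type name in a declaration, before a single operand, or between two operands.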

Lightness Races in Orbit
  • 3
    I don't think the committee decided anything; it was probably Dennis Ritchie when he created C. – Brian Neal Dec 31 '11 at 01:07
  • The committee is responsible for the language we know today. Fair point, it started with DR, but that's not "C" as we know it now. And, more than much else, the C++ committee has the conscious decision for the C++ language -- Ritchie did not. – Lightness Races in Orbit Dec 31 '11 at 01:07
  • 8
    Before there was a committee, Ritchie invented that syntax. Stroustrup was not going to break backwards compatibility with C by not using it. All these decisions were made before the development of C and C++ was put under a standards body. – Brian Neal Dec 31 '11 at 01:14
  • 3
    @BrianNeal: Fine, but the languages as we know them are governed by those committees, and those committees are -- currently -- responsible for that decision. Unless you want to credit that first protein in the primordial soup. – Lightness Races in Orbit Dec 31 '11 at 01:58
6

Haha, I feel your pain, I had the exact same problem.

I thought a pointer should be declared as &int because it makes sense that a pointer is an address of something.

After a while I figured out for myself that every type in C can be read backwards, like

int * const x

can be read as

x const * int

A constant x that, when dereferenced (signaled by *), is of type int. So something that has to be dereferenced has to be a pointer.
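
Reading right to left also tells you which part is const. A rough sketch of the two common cases, with the lines that would fail to compile left as comments (the function name `read_right_to_left` is made up, and C++11 is assumed for `decltype`):

```cpp
#include <cassert>
#include <type_traits>

int read_right_to_left() {
    int a = 1, b = 2;

    int * const x = &a;   // "x is a const pointer to int"
    *x = 10;              // OK: the int being pointed to is not const
    // x = &b;            // would not compile: x itself is const

    const int * y = &a;   // "y is a pointer to an int that is const"
    y = &b;               // OK: y itself may point elsewhere
    // *y = 20;           // would not compile: cannot write through y

    static_assert(std::is_const<decltype(x)>::value, "x itself is const");
    static_assert(!std::is_const<decltype(y)>::value, "y itself is not const");

    assert(a == 10);
    return a + *y;        // 10 + 2
}
```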

Binarian
6

Page 65 of Expert C Programming: Deep C Secrets includes the following: "And then, there is the C philosophy that the declaration of an object should look like its use."

Page 216 of The C Programming Language, 2nd edition (aka K&R) includes: "A declarator is read as an assertion that when its identifier appears in an expression of the same form as the declarator, it yields an object of the specified type."

I prefer the way van der Linden puts it.

sarnold