80

I am moving towards C++11 from C++98 and have become familiar with the auto keyword. I was wondering why we need to explicitly declare auto if the compiler is able to automatically deduce the type. I know C++ is a strongly typed language and this is a rule but was it not possible to achieve the same outcome without explicitly declaring a variable auto?

Vadim Kotov
Farsan Rashid
  • Remember that the C family is case-sensitive. Go debug some JS code where the author omitted "var" and used separate variables with names like "Bob", "bob" and "boB". Ugh. – PTwr May 23 '18 at 09:53
  • 47
    Even if it were possible it would be a moderately terrible idea. [Arguably the biggest weakness of Python](https://softwareengineering.stackexchange.com/a/30098/2366) (and similar languages) is the lack of a declaration syntax. It’s a major source of bugs that simple assignment will create a new variable. – Konrad Rudolph May 23 '18 at 10:58
  • @KonradRudolph: JS isn't any better with its declaration syntax though. I think what they meant to say was that it's the inability to restrict the scope of a variable in a fine-grained fashion. – user541686 May 24 '18 at 05:03
  • @Mehrdad Only because the syntax isn’t mandatory in JavaScript. If it were then, yes, that would be a lot better. – Konrad Rudolph May 24 '18 at 09:27
  • @KonradRudolph: Hm, not really... `var` *is* mandatory—it declares that the variable is local. If you don't use `var`, then it's no longer a local variable, so it actually changes the meaning. – user541686 May 24 '18 at 09:48
  • 4
    @Mehrdad “changes the semantics” ≠ “mandatory”. The problem is that JavaScript *does* accept implicit declarations. Yes, they are semantically different but that doesn’t help in the slightest. – Konrad Rudolph May 24 '18 at 11:19
  • The problem with implicit variable declaration is that every typo on the left hand of an assignment statement becomes a runtime bug. – 17 of 26 May 24 '18 at 18:35
  • 1
    See also the dual question, "why do real Perl users use the `my` keyword": https://stackoverflow.com/questions/8023959/why-use-strict-and-warnings/8024241#8024241 – Yann TM May 24 '18 at 20:23
  • All that being said, I'd still rather debug a Python program than the same thing written in C++ or JavaScript. You can use Cython if you want to explicitly declare stuff, too, including specific types. – Brōtsyorfuzthrāx May 24 '18 at 23:11

7 Answers

156

Dropping the explicit auto would break the language:

e.g.

int main()
{
    int n;
    {
        auto n = 0; // this shadows the outer n.
    }
}

Here you can see that without the auto, the inner line would read n = 0; and would assign to the outer n instead of declaring a new variable that shadows it.
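To make the difference concrete, here is a minimal sketch in today's C++ (the function names `with_auto` and `without_auto` are made up for illustration). The first declares a new inner `n`; the second shows what the same line already means when `auto` is left out.

```cpp
#include <cassert>

// With `auto`: the inner n is a new variable that shadows the outer one.
int with_auto()
{
    int n = 1;
    {
        auto n = 5;   // new variable; the outer n is untouched
        (void)n;      // silence "unused variable" warnings
    }
    return n;         // the outer n, still 1
}

// Without `auto`: the same line is an assignment to the outer n.
int without_auto()
{
    int n = 1;
    {
        n = 5;        // assignment; no new variable is introduced
    }
    return n;         // the outer n, now 5
}
```

Any rule that made `auto` optional would have to pick one of these two meanings for `n = 0;`, and existing code already relies on the second.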

Bathsheba
  • 8
    Was typing the exact same thing. Differentiating assignment from initialization would require an arbitrary choice on the standard's part. Since we already have the rule that "anything that could be a declaration is a declaration", we tread into very murky water. – StoryTeller - Unslander Monica May 23 '18 at 08:19
  • 1
    @StoryTeller: Indeed, no harm in having more than one answer, and you're typically more thorough than I am. And I'd up it. – Bathsheba May 23 '18 at 08:19
  • 1
    Besides, it would still be needed after all, even if optional in some cases. It would hardly be practical to remove `auto` from constructs like `auto const &x = f();`, not speaking of `auto [a, b]{g()};` – bipll May 23 '18 at 08:20
  • Nah, I think *"break the language"* sums it up better. You know how important it is to have a concise term for a concept when talking about something :P I just added a definition for the curious reader ;) – StoryTeller - Unslander Monica May 23 '18 at 08:21
  • 4
    This is not a problem. Like golang, you can obviously use something like `n := 0` to introduce a new variable. Why `auto` is used is an opinion based question. – llllllllll May 23 '18 at 08:23
  • 23
    @liliscent - Is it opinion based? (a) It was already a reserved keyword. (b) The meaning is quite clear. (c) It avoids the need of introducing new tokens (like `:=`). (d) It already fits in the grammar. I think there is very little room for opinion here. – StoryTeller - Unslander Monica May 23 '18 at 08:24
  • 2
    @StoryTeller You don't need new tokens if `x = f()` declares a new variable (if not yet existing), getting the type of `f`'s return value... Requiring auto to explicitly declare a variable, however, reduces the risk of declaring new variables by accident (e. g. because of a typo...). – Aconcagua May 23 '18 at 08:27
  • 34
    @Aconcagua - *"declares a new variable (if not yet existing)"* But shadowing *is* part of the language, and still must work, like Bathsheba illustrates. It's a bigger problem than one imagines. It's not about language design from scratch, it's about changing a living breathing language. Much harder to do. Kinda like changing the wheel on a speeding car. – StoryTeller - Unslander Monica May 23 '18 at 08:29
  • @StoryTeller As far as I read the question, it is not about dropping auto entirely, but to make it optional. So if you explicitly wanted to shadow, you still could use auto... – Aconcagua May 23 '18 at 08:31
  • @Aconcagua - I read it as "why do I even need `auto`". Which I think is answered beautifully. I must say I didn't consider the point of view you present. If you wish to explore it in another answer, ping me so I can upvote :) – StoryTeller - Unslander Monica May 23 '18 at 08:32
  • What about declaration without `auto` using list initialization syntax? `n{0};`. I think this could've worked out but making variable declaration less noticeable does not seem like a good idea. – user7860670 May 23 '18 at 08:33
  • Actually this is how it already works in lambda capture list variable declarations, except that type is deduced using `decltype(auto)` rules, if i'm not mistaken. `[n{0}](){ std::cout << n; }();` – user7860670 May 23 '18 at 08:37
40

Your question allows two interpretations:

  • Why do we need 'auto' at all? Can't we simply drop it?
  • Why are we obliged to use auto? Can't we just have it implicit, if it is not given?

Bathsheba nicely answered the first interpretation. For the second, consider the following (assuming no other declarations exist so far; hypothetically valid C++):

int f();
double g();

n = f(); // declares a new variable, type is int;
d = g(); // another new variable, type is double

if(n == d)
{
    n = 7; // reassigns n
    auto d = 2.0; // new d, shadowing the outer one
}

It would be possible; other languages get away with it quite well (apart from the shadowing issue, perhaps). It is not so in C++, though, and the question (in the sense of the second interpretation) now is: why?

This time, the answer is not as evident as in the first interpretation. One thing is obvious, though: the explicit requirement for the keyword makes the language safer (I do not know whether this is what drove the language committee to its decision; still, it remains a point):

grummel = f();

// ...

if(true)
{
    brummel = f();
  //^ uh, oh, a typo...
}

Can we agree on this not needing any further explanations?

The even bigger danger in not requiring auto, [however], is that it means that adding a global variable in a place far away from a function (e.g. in a header file) can turn what was intended to be the declaration of a locally-scoped variable in that function into an assignment to the global variable... with potentially disastrous (and certainly very confusing) consequences.

(Quoted from psmears' comment because of its importance; thanks for the hint.)
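The global-variable hazard can be sketched in real C++ as follows. The names `total`, `sum_with_declaration` and `sum_with_assignment` are hypothetical; `total` stands in for a global that somebody adds later in a distant header.

```cpp
#include <cassert>

// Hypothetical global, imagined as added later in some far-away header.
int total = 100;

// The author asked for a local variable explicitly, so the global
// cannot interfere: `auto` declares a new local that shadows ::total.
int sum_with_declaration(int a, int b)
{
    auto total = a + b;  // local variable
    return total;
}

// Without a declaration the line is a plain assignment. Under an
// implicit-declaration rule, a former "local" would silently turn
// into this as soon as the global came into existence.
int sum_with_assignment(int a, int b)
{
    total = a + b;       // overwrites the global
    return total;
}
```

Both functions return the same sum, but only the second one changes `::total` as a side effect, which is exactly the kind of action at a distance the explicit keyword rules out.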

Aconcagua
  • 24
    The even bigger danger in not requiring `auto`, in my view, is that it means that adding a global variable in a place far away from a function (e.g. in a header file) can turn what was intended to be the declaration of a locally-scoped variable in that function into an assignment to the global variable... with potentially disastrous (and certainly very confusing) consequences. – psmears May 23 '18 at 13:54
  • 1
    @psmears Languages like Python avoid that by requiring explicit specification of a variable as global/nonlocal for assignments; by default it simply creates a new local variable with that name. (Of course, you can read from a global variable without needing the `global ` statement.) That would require even more modification to the C++ language, of course, so probably wouldn't be feasible. – JAB May 23 '18 at 22:02
  • @JAB - yep, I'm aware of that... I didn't mention it because, as you say, it would require even more modification to the language :) – psmears May 24 '18 at 06:49
  • FWIW, what drove the language committee is most likely history. AFAIK, when C was originally written, local variables were saved on the stack, and C required all variables to be explicitly declared first in a block. That allowed the compiler to determine the storage requirement for that block before compiling the rest of the code, and allowed it to issue the correct instruction sequence to allocate space on the stack. IIRC `MOV R6 R5` `SUB #nnn R6` on a PDP-11 assuming R5 is used as the frame pointer and R6 is the stack pointer. nnn is the number of bytes of storage needed. – dgnuff May 24 '18 at 23:04
  • 2
    People do manage to use Python, which happily declares a variable every time a new name appears on the left side of an assignment (even if that name is a typo). But I do consider it one of the language's more serious flaws. – hobbs May 25 '18 at 07:16
15

was it not possible to achieve the same outcome without explicitly declaring a variable auto?

I am going to rephrase your question slightly in a way that will help you understand why you need auto:

Was it not possible to achieve the same outcome without explicitly using a type placeholder?

Was it not possible? Of course it was "possible". The question is whether it would be worth the effort to do it.

Most syntaxes in other languages that do not use typenames work in one of two ways. There's the Go-like way, where name := value; declares a variable. And there's the Python-like way, where name = value; declares a new variable if name has not previously been declared.

Let's assume that there are no syntactic issues with applying either syntax to C++ (even though I can already see that identifier followed by : in C++ means "make a label"). So, what do you lose compared to placeholders?

Well, I can no longer do this:

auto &name = get<0>(some_tuple);

See, auto always means "value". If you want to get a reference, you need to explicitly use a &. And auto & will rightly fail to compile if the initializer expression is a prvalue, because a non-const lvalue reference cannot bind to a temporary. Neither of the assignment-based syntaxes has a way to differentiate between references and values.

Now, you could make such assignment syntaxes deduce references if the given value is a reference. But that would mean that you can't do:

auto name = get<0>(some_tuple);

This copies from the tuple, creating an object independent of some_tuple. Sometimes, that's exactly what you want. This is even more useful if you want to move from the tuple with auto name = get<0>(std::move(some_tuple));.
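A small sketch of the three cases, assuming a hypothetical `std::tuple<int, std::string>` (the function names are made up as well, and the move case uses element 1 so that moving is actually meaningful):

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// `auto &` binds a reference: writes go through to the tuple element.
int modify_through_reference()
{
    std::tuple<int, std::string> some_tuple{1, "one"};
    auto &name = std::get<0>(some_tuple);
    name = 42;
    return std::get<0>(some_tuple);   // the element itself was changed
}

// Plain `auto` copies: the tuple element is left alone.
int modify_a_copy()
{
    std::tuple<int, std::string> some_tuple{1, "one"};
    auto name = std::get<0>(some_tuple);
    name = 42;                        // changes only the local copy
    return std::get<0>(some_tuple);   // the element is unchanged
}

// Plain `auto` with std::move: the new object is move-constructed
// from the element instead of being copied.
std::string move_out_of_tuple()
{
    std::tuple<int, std::string> some_tuple{1, "one"};
    auto text = std::get<1>(std::move(some_tuple));
    return text;
}
```

The point is that the programmer picks reference or value at the declaration, independently of what `get` happens to return.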

OK, so maybe we could extend these syntaxes a bit to account for this distinction. Maybe &name := value; or &name = value; would mean to deduce a reference like auto&.

OK, fine. What about this:

decltype(auto) name = some_thing();

Oh that's right; C++ actually has two placeholders: auto and decltype(auto). The basic idea of this deduction is that it works exactly as if you had done decltype(expr) name = expr;. So in our case, if some_thing() returns an object by value, it will deduce an object type. If some_thing() returns a reference, it will deduce a reference.

This is very useful when you're working in template code and are not sure exactly what the return value of a function will be. This is great for forwarding, and it is an essential tool, even if it is not widely used.
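Here is a minimal sketch of that behaviour (requires C++14). The names `stored`, `make_value`, `get_reference` and `decltype_auto_demo` are invented for the example; the `static_assert`s check the deduced types at compile time.

```cpp
#include <cassert>
#include <type_traits>

int stored = 7;                              // hypothetical storage for the demo

int  make_value()    { return stored; }      // returns by value
int &get_reference() { return stored; }      // returns a reference

int decltype_auto_demo()
{
    decltype(auto) a = make_value();     // deduced as int
    decltype(auto) b = get_reference();  // deduced as int &
    auto           c = get_reference();  // plain auto: int, a copy

    static_assert(std::is_same<decltype(a), int>::value,   "value stays a value");
    static_assert(std::is_same<decltype(b), int &>::value, "reference stays a reference");
    static_assert(std::is_same<decltype(c), int>::value,   "auto drops the reference");

    b = 9;     // writes through to `stored`
    c = 1;     // changes only the local copy
    (void)a;
    (void)c;
    return stored;
}
```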

So now we need to add more to our syntax. name ::= value; means "do what decltype(auto) does". I don't have an equivalent for the Pythonic variant.

Looking at this syntax, isn't that rather easy to accidentally mis-type? Not only that, it's hardly self-documenting. Even if you've never seen decltype(auto) before, it's big and obvious enough that you can at least easily tell that there's something special going on. Whereas the visual difference between ::= and := is minimal.

But that's opinion stuff; there are more substantive issues. See, all of this is based on using assignment syntax. Well... what about places where you can't use assignment syntax? Like this:

for(auto &x : container)

Do we change that to for(&x := container)? Because that seems to be saying something very different from range-based for. It looks like it's the initializer statement from a regular for loop, not a range-based for. It would also be a different syntax from non-deduced cases.
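For comparison, this is how the placeholder already reads in a range-based for, with the same reference-versus-value choice as in an ordinary declaration. A small sketch with made-up function names:

```cpp
#include <cassert>
#include <vector>

// `auto &x` names each element itself, so the loop can modify the container.
std::vector<int> doubled_in_place(std::vector<int> container)
{
    for (auto &x : container)
        x *= 2;
    return container;
}

// `auto x` copies each element, so the container is left unchanged.
std::vector<int> doubled_copies(std::vector<int> container)
{
    for (auto x : container)
        x *= 2;   // modifies only the copy
    return container;
}
```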

Also, copy-initialization (using =) is not the same thing in C++ as direct-initialization (using constructor syntax). So name := value; may not work in cases where auto name(value) would have.

Sure, you could declare that := will use direct-initialization, but that would be quite incongruent with the way the rest of C++ behaves.
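One (admittedly contrived) way to see the copy- versus direct-initialization difference with a deduced type is a class whose copy constructor is marked `explicit`. `Token` and `direct_init_compiles` are hypothetical names; the type traits below test the two initialization forms without needing a line that fails to compile.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type whose copy constructor is marked explicit.
struct Token
{
    Token() = default;
    explicit Token(const Token &) = default;
};

// Direct-initialization, as in `auto copy(original);`, may use explicit constructors.
static_assert(std::is_constructible<Token, const Token &>::value,
              "direct-initialization from an lvalue works");

// Copy-initialization, as in `auto copy = original;`, may not.
static_assert(!std::is_convertible<const Token &, Token>::value,
              "copy-initialization from an lvalue is rejected");

bool direct_init_compiles()
{
    Token original;
    auto copy(original);       // OK: direct-initialization
    // auto copy2 = original;  // error: needs a non-explicit constructor
    (void)copy;
    return true;
}
```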

Also, there's one more thing: C++14. It gave us one useful deduction feature: return type deduction. But this is based on placeholders. So much like range-based for, it is fundamentally based on a typename that gets filled in by the compiler, not by some syntax applied to a particular name and expression.
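A short sketch of C++14 return type deduction, where the placeholder sits in the return-type position and no variable or assignment is involved at all. The names `scaled`, `counter`, `counter_ref` and `counter_again` are made up for the example.

```cpp
#include <cassert>
#include <type_traits>

// The return type is deduced from the return statement.
auto scaled(int count, double factor)
{
    return count * factor;   // int * double yields double
}
static_assert(std::is_same<decltype(scaled(2, 1.5)), double>::value,
              "deduced as double");

int counter = 0;  // hypothetical state for the demo

// `auto &` and `decltype(auto)` work in the return-type position too.
auto &counter_ref() { return counter; }                    // int &
decltype(auto) counter_again() { return counter_ref(); }   // int &

static_assert(std::is_same<decltype(counter_ref()), int &>::value,
              "auto & deduces a reference");
static_assert(std::is_same<decltype(counter_again()), int &>::value,
              "decltype(auto) preserves the reference");
```

An assignment-based syntax such as `name := value` has no obvious counterpart here, because there is no name being assigned to.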

See, all of these problems come from the same source: you're inventing entirely new syntax for declaring variables. Placeholder-based declarations didn't have to invent new syntax. They're using the exact same syntax as before; they're just employing a new keyword that acts like a type, but has a special meaning. This is what allows it to work in range-based for and for return type deduction. It is what allows it to have multiple forms (auto vs. decltype(auto)). And so forth.

Placeholders work because they are the simplest solution to the problem, while simultaneously retaining all of the benefits and generality of using an actual type name. If you came up with another alternative that worked as universally as placeholders do, it is highly unlikely that it would be as simple as placeholders.

Unless it was just spelling placeholders with different keywords or symbols...

Nicol Bolas
  • 2
    IMHO, this is the only answer that addresses some substantial rationale behind the choice of a placeholder. Type deduction in generic lambda might be another example. It's a pity that this answer got so few upvotes just because it was posted a little late... – llllllllll May 27 '18 at 15:35
  • @liliscent: "*Type deduction in generic lambda might be another example.*" I didn't mention that because it has different semantics from `auto` declarations/return value deduction. – Nicol Bolas May 27 '18 at 16:25
  • @liliscent: Indeed, this answer is late to the party. Upped. (One improvement though would be a mention of the C++ idea that if something *could* be a declaration then it *is* a declaration.) – Bathsheba May 30 '18 at 06:17
12

In short: auto could be dropped in some cases, but that would lead to inconsistency.

First of all, as pointed out, the declaration syntax in C++ is <type> <varname>. Explicit declarations require some type or at least a declaration keyword in its place. So we could use var <varname> or declare <varname> or something, but auto is a long-standing keyword in C++, and is a good candidate for an automatic type deduction keyword.

Is it possible to implicitly declare variables by assignment without breaking everything?

Sometimes yes. You can't perform assignment outside functions, so you could use assignment syntax for declarations there. But such an approach would bring inconsistency to the language, possibly leading to human errors.

a = 0; // Error. Could be parsed as auto declaration instead.
int main() {
  return 0;
}

And when it comes to any kind of local variable, explicit declarations are the way of controlling the scope of a variable.

a = 1; // use a variable declared before or outside
auto b = 2; // declare a variable here

If ambiguous syntax were allowed, declaring global variables could suddenly convert local implicit declarations to assignments. Finding those conversions would require checking everything. And to avoid collisions you would need unique names for all globals, which kind of destroys the whole idea of scoping. So it's really bad.

Andrew Svietlichnyy
11

auto is a keyword which you can use in places where you normally need to specify a type.

  int x = some_function();

This can be made more generic by having the int type deduced automatically:

  auto x = some_function();

So it's a conservative extension to the language; it fits into the existing syntax. Without it x = some_function() becomes an assignment statement, no longer a declaration.

rustyx
9

Syntax has to be unambiguous and also backward compatible.

If auto were dropped, there would be no way to distinguish between assignment statements and definitions.

auto n = 0; // fine
n = 0; // assignment statement; n has not been declared

code707
  • 3
    An important point is that `auto` was already a keyword (but with an obsolete meaning), so it didn't break code using it as a name. Which is a reason why a better keyword, like `var` or `let`, wasn't picked instead. – Frax May 23 '18 at 14:06
  • 1
    @Frax IMO `auto` is actually a pretty excellent keyword for this: it expresses exactly what it stands for, namely, it replaces a type name with “the automatic type”. With a keyword like `var` or `let`, you should consequently require the keyword _even if_ the type is specified explicitly, i.e. `var int n = 0` or something like `var n:Int = 0`. This is basically how it's done in Rust. – leftaroundabout May 23 '18 at 14:49
  • 1
    @leftaroundabout While `auto` is definitely excellent in the context of the existing syntax, I would say that something like `var int x = 42` being the basic variable definition, with `var x = 42` and `int x = 42` as shorthands, would make more sense than the current syntax if considered out of historical context. But it's mostly a matter of taste. But, you are right, I should have written "one of the reasons" instead of "a reason" in my original comment :) – Frax May 23 '18 at 15:28
  • @leftaroundabout: _"auto is actually a pretty excellent keyword for this: it expresses exactly what it stands for, namely, it replaces a type name with 'the automatic type'"_ That is not true. There is no "the automatic type". – Lightness Races in Orbit May 24 '18 at 14:44
  • @LightnessRacesinOrbit in any given context in which you can use `auto`, there is an automatic type (a different one, depending on the expression). – leftaroundabout May 24 '18 at 15:03
  • @leftaroundabout: There is a type that is automatically deduced, yes, but you appeared to claim that there is a single ("the") "automatic" type, which is a common misconception. :) – Lightness Races in Orbit May 24 '18 at 15:08
3

Adding to previous answers, one extra note from an old fart: It looks like you may see it as an advantage to be able to just start using a new variable without in any way declaring it.

In languages with the possibility of implicit definition of variables this can be a big problem, especially in larger systems. You make one typo and you debug for hours only to find out you unintentionally introduced a variable with a value of zero (or worse) - blue vs bleu, label vs lable ... the result is you can't really trust any code without thorough checking on precise variable names.

Just using auto tells both compiler and maintainer that it is your intention to declare a new variable.

Think about it: to avoid this sort of nightmare, the 'implicit none' statement was introduced in FORTRAN, and you see it used in all serious FORTRAN programs nowadays. Not having it is simply ... scary.

Bert Bril