1

This Wikipedia page defines C++ as a "white space independent language". While that is mostly true, as with all languages there are exceptions to the rule. The only one I can think of at the moment is this:

vector<vector<double> >

It must have a space, otherwise the compiler interprets the >> as a stream operator. What other ones are around? It would be interesting to compile a list of the exceptions.
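For reference, a minimal sketch of the case in question. The names `Matrix03`, `Matrix11` and `make_matrix` are made up for illustration, and the `__cplusplus` check assumes the compiler reports its standard version accurately:

```cpp
#include <cstddef>
#include <vector>

// The space between the closing angle brackets is required in C++03
// and still accepted by every later standard.
typedef std::vector<std::vector<double> > Matrix03;

#if __cplusplus >= 201103L
// Without the space: only accepted from C++11 onwards.
typedef std::vector<std::vector<double>> Matrix11;
#endif

// Hypothetical helper: builds a rows x cols matrix filled with zeros.
Matrix03 make_matrix(std::size_t rows, std::size_t cols) {
    return Matrix03(rows, std::vector<double>(cols, 0.0));
}
```

Both typedefs name the same type; only the spelling differs.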

SingerOfTheFall
Fantastic Mr Fox
  • 6
    You don't have to worry about that one with C++11. – Mat Sep 06 '12 at 05:01
  • 2
    A rule was specifically added to C++11 to allow `vector<vector<double>>` – Praetorian Sep 06 '12 at 05:01
  • See [g++ 4.7 evaluates operator “” as sibling to macro expansion](http://stackoverflow.com/questions/11909806/g-4-7-evaluates-operator-as-sibling-to-macro-expansion) for a problem with C++11 user-defined literals, where you need to add a whitespace. – Jesse Good Sep 06 '12 at 05:22
  • @Mat: very true... you have so many more serious things to worry about in C++11 that that issue even if it was still present is really insignificant :-) – 6502 Sep 06 '12 at 05:33
  • You need a space when fully qualifying a template argument: `::std::vector< ::std::string> vec;`. – Dietmar Kühl Sep 06 '12 at 06:58

6 Answers

11

Following that logic, you can use any two-character lexeme to produce such "exceptions" to the rule. For example, += and + = would be interpreted differently. I wouldn't call them exceptions though. In C++ in many contexts "no space at all" is quite different from "one or more spaces". When someone says that C++ is space-independent they usually mean that "one space" in C++ is typically the same as "more than one space".

This is reflected in the language specification, which states (see 2.1/1) that at phase 3 of translation the implementation is allowed to replace each nonempty sequence of whitespace characters other than new-line with one space character.
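The "no space" versus "one space" difference can be sketched with the lexer's longest-token rule (often called maximal munch). The two function names below are made up for illustration; the function bodies differ only by a single space:

```cpp
// Without a space the lexer takes the longest token first,
// so `a+++b` is tokenized as `a ++ + b`, i.e. (a++) + b.
int sum_without_space(int a, int b) {
    return a+++b;
}

// With a space after the first `+`, the tokens are `a + ++ b`,
// i.e. a + (++b).
int sum_with_space(int a, int b) {
    return a+ ++b;
}
```

Adding further spaces to the second version (for example `a +   ++b`) would not change its meaning, which is the sense in which the language is space-independent.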

AnT stands with Russia
4

The syntax and semantic rules for parsing C++ are indeed quite complex (I'm trying to be nice; I think one is authorized to say "a mess"). Proof of this is that for YEARS different compiler authors were arguing about what was legal C++ and what was not.

In C++, for example, you may need to parse an unbounded number of tokens before deciding the semantic meaning of the first of them (the dreaded "most vexing parse" rule, which also often bites newcomers).
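A minimal sketch of the most vexing parse, assuming a C++11 compiler for `decltype` and `<type_traits>`. `Timer`, `Keeper` and both function names are hypothetical; some compilers emit a warning for the first declaration:

```cpp
#include <type_traits>

struct Timer {};

struct Keeper {
    explicit Keeper(Timer) {}
    int value() const { return 42; }
};

// Reports whether `k` was parsed as a function declaration.
bool first_form_is_function() {
    // Looks like an object built from a temporary Timer, but it is parsed
    // as the declaration of a function `k` returning Keeper.
    Keeper k(Timer());
    return std::is_function<decltype(k)>::value;
}

int second_form_value() {
    // The extra parentheses cannot be part of a parameter declaration,
    // so this defines an object.
    Keeper k((Timer()));
    return k.value();
}
```

The compiler cannot classify the first statement until it has looked well past the first token, which is the point made above.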

Your objection IMO however doesn't really make sense... for example ++ has a different meaning from + +, and in Pascal begin is not the same as beg in. Does this make Pascal a space-dependent language? Is there any space-independent language (except brainf*ck)?

The only problem with the C++03 >> versus > > distinction is that this typing mistake was very common, so they decided to add even more complexity to the language definition to solve it in C++11.

The cases in which one whitespace character instead of several can make a difference (the thing that distinguishes space-dependent languages, and that plays no role in the > > / >> case) are indeed few:

  1. inside double-quoted strings (but everyone wants that and every language that supports string literals that I know does the same)

  2. inside single quotes (the same, although something that not many C++ programmers know is that there can be more than one char inside single quotes)

  3. in the preprocessor directives because they work on a line basis (newline is a whitespace and it makes a difference there)

  4. in line continuation, as noticed by stefaanv: to continue a single line you can put a backslash right before a newline, and in that case the language will ignore both characters (you can do this even in the middle of an identifier, even if the typical use is just to make long preprocessor macros readable). If you put other whitespace characters after the backslash and before the newline, however, the line continuation is not recognized (some compilers accept it anyway and simply check whether the last non-whitespace character of a line is a backslash). Line continuation can also be specified using the trigraph equivalent ??/ of backslash (any reasonable compiler should IMO emit a warning when finding a trigraph, as they most probably were not intended by the programmer).

  5. inside single-line comments (//), because there too a newline among the other whitespace in the middle of a comment makes a difference
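A few of the cases listed above can be sketched as follows. All names are made up for illustration, and the block relies on the second line of the split identifier starting in the first column:

```cpp
#include <cstring>

// Case 1: whitespace inside a string literal is part of the string.
std::size_t length_one_space()  { return std::strlen("a b"); }
std::size_t length_two_spaces() { return std::strlen("a  b"); }

// Case 4: backslash-newline is removed before tokenization,
// even in the middle of an identifier.
int split_identifier() {
    int coun\
ter = 7;
    return counter;
}

// Cases 3 and 4: a directive ends at the newline, and a continuation
// is what lets the macro body spill onto the next physical line.
#define SUM_OF_THREE(a, b, c) \
    ((a) + (b) + (c))

// Case 5: a single-line comment ends at the newline, so the declaration
// on the next line is code again.
// int hidden = 1;
int visible = 2;
```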

6502
  • It's not really an objection that I have; I thought it was just an interesting question. >> and > > are not something that you would think of as space independent, so I wondered how many other examples like this are out there. – Fantastic Mr Fox Sep 06 '12 at 06:51
  • @Ben: examples of what? Of a valid token being also the prefix of a valid longer token? You have `< <<` `+= + =` `<=` `< =` `>= > =` and a gajillion more. About syntax-level problems that surprise programmers in C++ there are also trigraphs, `and`, `or` and other keywords that have been defined apparently just for the fun of it. About semantic surprises things like `std::string s; s=3.14;` that is perfectly valid C++ or things like `false["foo"]` that is also valid C++. The list of questionable parts is really endless... – 6502 Sep 06 '12 at 07:23
  • You should add that to your answer, ie. that is the sort of thing I was interested in. – Fantastic Mr Fox Sep 07 '12 at 01:42
  • @Ben: You should be careful about exposing C++ problems. For reasons I do not understand many C++ programmers (especially if they arrived at C++ not long ago) are overzealous about the language and saying anything that is less than praising about any part of the language will just trigger downvotes. You can instead get upvotes for nonsenses like using template metaprogramming to build an half-baked and broken implementation of binary literals (I would be happy to be joking about this, but I'm not... see http://stackoverflow.com/a/2611850/320726) – 6502 Sep 07 '12 at 08:22
  • Thanks for the advice, the question was really just interest for me. I don't believe that the Stack Exchange system would work at all if people only asked questions for votes. If people think my question is badly worded or off topic then they can downvote it. But as it stands I think the comment I made posed some interesting questions and has obviously stimulated plenty of thought, which I believe is the basis for a good question. – Fantastic Mr Fox Sep 07 '12 at 14:01
2

Like it or not, macros are also part of C++, and multi-line macros must be continued with a backslash followed by EOL; there must be no whitespace between the backslash and the EOL.
Not a big issue, but still a whitespace exception.

stefaanv
  • A `backslash+newline` pair has nothing to do with macros, it's just the line continuation sequence. You can use it even in the middle of an identifier or of a string literal. There are no multi-line macros, they are on one single line that has been split using line continuations. – 6502 Sep 06 '12 at 14:33
  • @6502: technically correct as you are, for me, macro's are the only place where I needed the line continuation, making them look multi-line, hence my answer. But your comment is a valid explanation. – stefaanv Sep 06 '12 at 14:48
  • +1: if the question was about where one whitespace instead of more whitespaces make a difference indeed the line continuation is formally a meaningful case (however still this wouldn't make IMO the language a space dependent one from a practical point of view as line continuation is really close to sort of a "pre-language" part) – 6502 Sep 07 '12 at 05:39
1

This was because of limitations in the parser before C++11; it is no longer the case.

The reason is that it was hard to tell >> closing two nested template argument lists apart from operator >>.

Adrian Cornish
1

While C++03 did interpret >> as the shift operator in all cases (it is overloaded for use with streams, but it's still the shift operator), the language parser in C++11 will now attempt to close a template argument list when reasonable.
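A small sketch of what that means in practice. `Value` and `Wrap` are hypothetical templates, and the `__cplusplus` check assumes the compiler reports its standard version accurately:

```cpp
template <int N>
struct Value { static const int result = N; };

template <typename T>
struct Wrap { static const int result = T::result; };

// Inside parentheses >> is a shift in every standard, so this is the
// portable way to use a shift in a template argument.
int shifted() { return Value<(8 >> 1)>::result; }

#if __cplusplus >= 201103L
// C++11 and later: >> closes both template argument lists.
int nested() { return Wrap<Value<3>>::result; }
#else
// C++03: the space is required.
int nested() { return Wrap<Value<3> >::result; }
#endif
```

Written without the parentheses, `Value<8 >> 1>` was a valid shift in C++03, whereas in C++11 the first `>` ends the argument list and the expression no longer compiles.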

John Dibling
1
  • Nested template parameters: set<set<int> >.
  • Character literals: ' '.
  • String literals: " ".
  • Juxtaposition of keywords and identifiers: else return x;, void foo(){}, etc.
Thomas Eding