
Python uses whitespace to denote blocks:

for x in range(0, 50):
    print(x)

In JavaScript we use curly braces for that purpose:

for (let x = 0; x < 50; x++) { console.log(x) }

Does whitespace have any significance in ECMAScript when parsing a program and creating an AST?

I've looked at the source of the TypeScript compiler, and it seems to ignore whitespace when parsing the source code.

Max Koretskyi
  • *Automatic semicolon insertion* is the only example I can think of that makes white space have any significance in JS: https://stackoverflow.com/q/2846283/4879 – pawel Oct 20 '17 at 08:11
  • @pawel, I've found the relevant part in the spec. Only line terminators are important. I've added my own answer. – Max Koretskyi Oct 22 '17 at 15:52
  • Of course whitespace is significant: it separates tokens! Are you talking about newlines and indentation only? – Bergi Oct 22 '17 at 15:53
  • @Bergi, I'm talking about spaces and tabs. What do you mean by "it separates tokens"? When the scanner identifies tokens, it doesn't use whitespace for that – Max Koretskyi Oct 22 '17 at 15:55
  • @AngularInDepth.com Of course [whitespaces are used](https://www.ecma-international.org/ecma-262/8.0/index.html#sec-white-space). They're what allow you to distinguish `newObject` from `new Object`. – Bergi Oct 22 '17 at 16:01
  • @Bergi, aha, you're right, thanks a lot for the hint! It's important for productions that are recursive in nature, like [IdentifierName](https://www.ecma-international.org/ecma-262/8.0/index.html#prod-IdentifierName), correct? – Max Koretskyi Oct 22 '17 at 16:07
  • I'd rather put it as "It's important for the productions that may not contain whitespaces". Whitespaces basically act as delimiters between them, as defined in the [most basic syntax definition](https://www.ecma-international.org/ecma-262/8.0/index.html#prod-InputElementDiv). – Bergi Oct 22 '17 at 16:14

1 Answer

As @Bergi noted, whitespace matters during lexical analysis: it lets the scanner know where a particular token ends. For example, this is what distinguishes `newObject` from `new Object`. It is important for the productions that may not contain whitespace. Since a space cannot be derived from IdentifierPart, it marks the end of the Identifier token. WhiteSpace is also listed as a separate alternative in the productions for all lexical goal symbols, starting with the simplest one, InputElementDiv:

InputElementDiv::
    WhiteSpace

During syntactic analysis, only line terminators matter, because they guide automatic semicolon insertion. The rest of the white space is not relevant. Here is the quote from the spec:

Moreover, line terminators, although not considered to be tokens, also become part of the stream of input elements and guide the process of automatic semicolon insertion (11.9). Simple white space and single-line comments are discarded and do not appear in the stream of input elements for the syntactic grammar.
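
A small sketch of what that means in practice, using the well-known `return` case. The variable names are only for illustration; `new Function` is used so that the line terminator can be written explicitly as `\n` inside the source string.

```javascript
// Spaces and tabs between `return` and its operand change nothing.
const sameLine = new Function('return 42;');
const extraSpaces = new Function('return \t  42;');

// A line terminator after `return` triggers automatic semicolon insertion,
// so this body is parsed as `return; 42;`.
const splitLine = new Function('return\n42;');

console.log(sameLine());    // 42
console.log(extraSpaces()); // 42
console.log(splitLine());   // undefined
```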

Max Koretskyi