Possible Duplicate:
What are the rules for Javascript's automatic semicolon insertion?

JavaScript befuddles me with its implicit line termination. It's a very C-like language, except that ending lines in a semi-colon is often optional.

So how does it decide when to assume an end-of-line?

Consider this example:

var x = 1 + 2
-3 + 3 == 0 ? alert('0') : alert('3')

Punching that into an HTML file and opening it in Safari popped up 3. If you stick a semicolon on the end of the first line, it changes to 0.

The algorithms and logic are all straightforward; what interests me is by what criteria JavaScript decided, in this instance, not to assume an end-of-line after the first line. Does it only assume an EOL once it hits an error scenario? Or are there more definite criteria?

I'm very curious. I haven't researched this much; I want to see what the S/O community has to say about it. I always end my lines with semicolons anyway, but I have some JS compression code that trips on the semicolon issue from time to time when I inadvertently leave one out.

Edit

OK, just to clarify what the actual question is here: can anybody describe, in non-abstract terms, when JavaScript will and won't automatically insert semicolons?

This is NOT a duplicate. I'm aware that the rules for automatic semicolon insertion are well established and concisely documented. They're also long winded and confusing because they are generally abstract. In my experience, high level programmers don't digest low level documentation as well as simple end results, which is what I'm looking for.

Jonathan Leffler
Neil
  • http://inimino.org/~inimino/blog/javascript_semicolons – Quentin Jun 06 '11 at 12:37
  • Thanks Quentin; however, I'm trying to encourage discussion, not sprout a linkfest :) – Neil Jun 06 '11 at 12:40
  • Neil, this site is *not* a forum for discussion, but for questions that can be answered. Moreover, what do you want to discuss about semicolon insertion? Whether the ECMAScript people are eggheads? Whether minimizers should insert them as well? – Marcel Korpel Jun 06 '11 at 12:42
  • What's to discuss? The specification is clear enough (and that document explains it more clearly). – Quentin Jun 06 '11 at 12:43
  • OK, perhaps I need to clarify; most existing documentation is fairly abstract, i.e. it talks about valid/invalid tokens rather than language specifics. I'd like to boil it down to more immediately identifiable rules, if that is actually possible. For example, I like the summary in Quentin's link which refers to parentheses, square brackets and math operators. THAT is more useful information in my opinion. Marcel, hopefully this answers your question. – Neil Jun 06 '11 at 12:51
  • OK, that sounds more viable than your former statement, but it *is* a complex topic. See CMS' answer in the linked question for a description; it's just that way. – Marcel Korpel Jun 06 '11 at 12:59
  • Come on fellas. It covers some similar ground but certainly isn't a duplicate. Some of us think in different ways and need to approach issues from different angles. Help me reopen this baby so I can finish it off. Ta. – Neil Jun 06 '11 at 22:54

2 Answers

The ECMA specification (ch. 7.9.1, page 26) states:

There are three basic rules of semicolon insertion:

  1. When, as the program is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:
    • The offending token is separated from the previous token by at least one LineTerminator.
    • The offending token is }.
  2. When, as the program is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Program, then a semicolon is automatically inserted at the end of the input stream.
  3. When, as the program is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation “[no LineTerminator here]” within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.
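As a rough illustration of rules 1 and 3 (rule 2, the end-of-input case, is not shown), here is a minimal sketch. The function names are made up for this example and are not taken from the specification:

```javascript
// Rule 1: `var` cannot continue the expression `1`, and it follows a
// line break, so a semicolon is inserted before it.
function offendingTokenAfterNewline() {
  var a = 1
  var b = 2
  return a + b
}

// Rule 1: `}` is the offending token, so a semicolon is inserted before it
// even though there is no line break.
function offendingClosingBrace() { return 7 }

// Rule 3: `return` is a restricted production; a line break right after it
// ends the statement, so the 42 on the next line is never returned.
function restrictedReturn() {
  return
  42;
}

// Rule 3: postfix `++` may not be separated from its operand by a line
// break, so this parses as `i; ++j;`.
function restrictedPostfix() {
  var i = 0, j = 10;
  i
  ++
  j
  return { i: i, j: j };
}
```

Calling these gives 3, 7, undefined and an object with i equal to 0 and j equal to 11, respectively.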

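I think this behaviour has to do with the first and second points. The - that begins the second line is a valid continuation of 1 + 2, so it is never an offending token, and:

var x = 1 + 2
-3 + 3 == 0 ? alert('0') : alert('3')

can be parsed as a single complete ECMAScript Program, so no semicolon is inserted after the first line.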
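To see both readings side by side, here is a small sketch. The makeRecorder helper and its report function are hypothetical stand-ins for alert() so that the outcome can be inspected; they are not part of the question's code:

```javascript
// Hypothetical stand-in for alert(): records each message instead of showing it.
function makeRecorder() {
  const calls = [];
  return { report: (msg) => { calls.push(msg); }, calls: calls };
}

function withoutSemicolon() {
  const { report, calls } = makeRecorder();
  // No semicolon after the first line, so this is ONE statement:
  //   var x = ((1 + 2 - 3 + 3) == 0) ? report('0') : report('3')
  var x = 1 + 2
  -3 + 3 == 0 ? report('0') : report('3')
  return { x: x, calls: calls };
}

function withSemicolon() {
  const { report, calls } = makeRecorder();
  // The explicit semicolon makes these TWO statements:
  //   var x = 3;   and   (-3 + 3 == 0) ? report('0') : report('3')
  var x = 1 + 2;
  -3 + 3 == 0 ? report('0') : report('3')
  return { x: x, calls: calls };
}
```

In the first function the comparison is 3 == 0, so '3' is reported and x ends up holding the (undefined) return value of report. In the second, x is 3 and the comparison is 0 == 0, so '0' is reported. That matches the 3 and 0 seen in the question.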

Because it's not always clear how the parser will insert semi-colons, it's advisable to not leave it to the parser (i.e. always insert the semi-colons yourself).

In ch. 7.9.2 (Examples of Automatic Semicolon Insertion) of the same specification, this example looks like your situation:

The source

a = b + c  
(d + e).print()

is not transformed by automatic semicolon insertion, because the parenthesised expression that begins the second line can be interpreted as an argument list for a function call:
a = b + c(d + e).print()
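A runnable sketch of that example, assuming for illustration that c happens to be a function returning an object with a print method (the values and the log array are invented here so the accidental call can be observed):

```javascript
function specExample() {
  const log = [];
  const b = 1, d = 2, e = 3;
  // Assumption for this sketch: c is callable, so the unintended call succeeds.
  function c(arg) {
    log.push('c called with ' + arg);
    return { print: function () { log.push('print called'); return 10; } };
  }
  let a;
  // No semicolon is inserted between these two lines;
  // they parse as: a = b + c(d + e).print()
  a = b + c
  (d + e).print()
  return { a: a, log: log };
}
```

Here a ends up as 11 (1 plus the 10 returned by print), and the log shows that c was called with 5. If c were not a function, the same two lines would instead throw a TypeError at run time.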

Jonathan Leffler
KooiInc
  • Cheers, that's good to know. I guess I'm trying to boil this down so I can code a JS compressor that doesn't need to parse everything and tokenise it, but instead deal with simple regex'es and patterns. – Neil Jun 06 '11 at 12:57

Don't leave it to the compiler; always put semicolons at the end of statements.

RobG
  • Rob, I think that's a valid position, but it defeats the purpose of my question. Thanks anyway. – Neil Jun 06 '11 at 12:52
  • RobG is totally correct; the smart thing to do is to form your code in such a way that ASI is never a problem. – Eli Jun 06 '11 at 13:09
  • Eli, if I were to say the world is grey, I'd also be correct. But it's no good if I want to know exactly which shade. I'm trying to write a kind of JS parser, and the reality is, people frequently DON'T do what's best, but that's not a good reason to let their code crash and burn. – Neil Jun 06 '11 at 22:47