21

Many programming languages allow trailing commas in their grammar following the last item in a list. Supposedly this was done to simplify automatic code generation, which is understandable.

As an example, the following is a perfectly legal array initialization in Java (JLS 10.6 Array Initializers):

int[] a = { 1, 2, 3, };

I'm curious if anyone knows which language was first to allow trailing commas such as these. Apparently C had it as far back as 1985.

If anybody knows of other grammar "peculiarities" in modern programming languages, I'd be very interested in hearing about those too. I read that Perl and Python, for example, are even more liberal, allowing trailing commas in other parts of their grammar.
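To make the Python claim concrete, here is a minimal sketch of positions where, as far as I know, Python 3 accepts a trailing comma. The function `area` and the variable names are made up for illustration.

```python
# Trailing commas Python accepts in several grammatical positions.

def area(width, height,):          # parameter list
    return width * height

numbers = [1, 2, 3,]               # list display
pair = (1, 2,)                     # tuple display
ages = {"ann": 30, "bob": 25,}     # dict display
result = area(3, 4,)               # call arguments

# The one place where the comma changes the meaning: a one-element tuple.
single = (1,)        # a tuple containing 1
not_a_tuple = (1)    # just the integer 1 in parentheses
```

Note the last two lines: in every other position the comma is optional decoration, but for a single-element tuple it is what makes the tuple.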

Jonathan Leffler
polygenelubricants
  • If I recall, when I learned C from the original K & R (pre-ANSI) in 1983, trailing commas were allowed. This was intentional to allow easier automatic code generation (by tools like YACC). – Ralph Sep 10 '11 at 13:05
  • 2
    It also helps when (as in an `enum` definition) each item is on its own line: adding a new value at the end is simply an additional line (not requiring a change of the newly penultimate line), potentially avoiding a dependency on/by any changes to the newly penultimate line. – jhfrontz Feb 15 '18 at 19:38
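The point about the "newly penultimate line" can be checked with a line-based diff. The sketch below is only an illustration: `changed_lines` is a hypothetical helper, and the C `enum` snippets are plain strings invented for the example (a trailing comma in a C enumerator list needs C99 or later). It counts how many lines a diff would mark as removed and added when one value is appended.

```python
import difflib

def changed_lines(before, after):
    """Return (removed, added) line counts between two source texts."""
    a, b = before.splitlines(), after.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added

# Style 1: no trailing comma, so appending BLUE also rewrites the GREEN line.
without_comma_before = "enum color {\n    RED,\n    GREEN\n};"
without_comma_after = "enum color {\n    RED,\n    GREEN,\n    BLUE\n};"

# Style 2: trailing comma, so appending BLUE is a pure one-line insertion.
with_comma_before = "enum color {\n    RED,\n    GREEN,\n};"
with_comma_after = "enum color {\n    RED,\n    GREEN,\n    BLUE,\n};"

print(changed_lines(without_comma_before, without_comma_after))  # (1, 2)
print(changed_lines(with_comma_before, with_comma_after))        # (0, 1)
```

With the trailing comma the change touches only the new line, which keeps version-control diffs small and avoids conflicts with unrelated edits to the previous line.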

4 Answers

4

I just found out that the g77 Fortran compiler has the -fugly-comma ("Ugly Null Arguments") flag, though it's a bit different (and, as the name implies, rather ugly).

The -fugly-comma option enables use of a single trailing comma to mean “pass an extra trailing null argument” in a list of actual arguments to an external procedure, and use of an empty list of arguments to such a procedure to mean “pass a single null argument”.

For example, CALL FOO(,) means “pass two null arguments”, rather than “pass one null argument”. Also, CALL BAR() means “pass one null argument”.

I'm not sure which version of the language this first appeared in, though.

Community
polygenelubricants
4

I'm not an expert on the commas, but I know that standard Pascal was very persnickety about semicolons being statement separators, not terminators. That meant you had to be very, very careful about where you put one if you didn't want to get yelled at by the compiler.

Later Pascal-esque languages (C, Modula-2, Ada, etc.) had their standards written to accept the odd extra semicolon without behaving like you'd just peed in the cake mix.

T.E.D.
  • 1
    Speaking of semicolons, I'd be interested to see a chart on the usage frequency of semicolons in human history. According to Wikipedia, the earliest general use is in 1591. I suspect that there's a jump in its usage with every new curly-bracket style programming language invented. There's probably also a bump when people figure out that you can use it to wink at people `;)` – polygenelubricants Feb 22 '10 at 17:38
  • 6
    Seriously; I don't think it has altered the frequency I use them; – Aiden Bell Feb 22 '10 at 19:15
  • It's interesting to note that in BASIC, the colon is used much more often than the semicolon, and that on the Commodore Vic-20 and derivatives (C64, C128, etc.) the middle row ends with J, K, L, colon (rather than J, K, L, semicolon). – supercat Oct 27 '11 at 00:48
  • You can find the background for C [here](http://stackoverflow.com/a/29099924/1708801) – Shafik Yaghmour Mar 18 '15 at 13:54
2

[Does anybody know] other grammar "peculiarities" of modern programming languages?

One of my favorites, Modula-3, was designed in 1990 with Niklaus Wirth's blessing as the then-latest language in the "Pascal family". Does anyone else remember those awful fights about whether the semicolon should be a separator or a terminator? In Modula-3, the choice is yours! The EBNF for a sequence of statements is

stmt ::= BEGIN [stmt {; stmt} [;]] END

Similarly, when writing alternatives in a CASE statement, Modula-3 let you use the vertical bar | as either a separator or a prefix. So you could write

CASE c OF
| 'a', 'e', 'i', 'o', 'u' => RETURN Char.Vowel
| 'y' => RETURN Char.Semivowel
ELSE RETURN Char.Consonant
END

or you could leave off the initial bar, perhaps because you prefer to write OF in that position.

I think what I liked as much as the design itself was the designers' awareness that there was a religious war going on and their persistence in finding a way to support both sides. Let the programmer choose!


P.S. Objective Caml allows permissive use of | in case expressions whereas the earlier and closely related dialect Standard ML does not. As a result, case expressions are often uglier in Standard ML code.


EDIT: After seeing T.E.D.'s answer I checked the Modula-2 grammar, and he's correct: Modula-2 also supported the semicolon as a terminator, but through the device of the empty statement, which makes stuff like

x := x + 1;;;;;; RETURN x

legal. I suppose that's not a bad thing. Modula-2 didn't allow flexible use of the case separator |, however; that seems to have originated with Modula-3.

Norman Ramsey
  • That's right. That's also how C got around the issue, I believe. Ada's solution is closer to Modula-3's, so that ugly-looking multi semicolon thing would not be legal. If you want an "empty statement" for some weird reason in Ada, you have to say `null;` – T.E.D. Feb 23 '10 at 13:46
  • Using the semicolon as an empty statement is not legal in Wirth's classic Modula-2. It was added in ISO Modula-2 after much debate, and not everyone was happy with it. Modula-2 R10 sticks with Wirth's approach but adds a built-in dummy procedure called TODO, which takes a character literal as its argument; the literal is printed as a compile-time warning in DEBUG mode, or as an error in production build mode. – trijezdci Sep 29 '15 at 16:07
2

Something which has always galled me about C is that although it allows an extra trailing comma in an initializer list, it does not allow an extra trailing comma in an enumerator list (for defining the literals of an enumeration type). This little inconsistency has bitten me in the ass more times than I care to admit. And for no reason!

Norman Ramsey
  • 1
    I think you're backward on that. I think the rationale was that the presence of an extra item at the end of an enum definition won't affect anything (unless its name collides with another identifier), whereas adding an extra item at the end of an initializer could affect the size of the allocated array. – supercat Oct 27 '11 at 00:50
  • 2
    Trailing comma in enum lists was fixed in the C99 standard, which was released 11 years before this answer was written. It's a known and fixed language bug. – Lundin Jun 30 '20 at 09:39
  • Meanwhile C++ allows trailing commas for both initializer lists and enum lists, but inconsistently, not for constructor member initializer lists (as of C++20 anyway). – Dwayne Robinson May 11 '21 at 01:57
  • So they are now allowed @Lundin ? – user129393192 Jun 12 '23 at 08:07
  • @user129393192 Yes since the year 1999. This answer is wrong. – Lundin Jun 12 '23 at 08:39