0

Once upon a time I was writing a C compiler in a computer science course on compilers. Part of the work involved using a C grammar in Backus-Naur Form (BNF) like this. What struck me as odd was that the grammar for initializer lists allowed a list to end with a comma (a so-called "dangling comma"). I tried it on my compiler, and on others, and confirmed that it was allowed.

Example:

<initializer> ::= <assignment-expression>
            | { <initializer-list> }
            | { <initializer-list> , }

In C:

int array[] = { 1, 2, 3, };
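A minimal sketch of that check, assuming a C99 or later compiler. The names `with_comma`, `without_comma` and `COUNT` are made up for illustration. It suggests the dangling comma does not add an element: both arrays should report the same element count.

```c
#include <stddef.h>

/* Two initializers that differ only in the dangling comma. */
static const int with_comma[]    = { 1, 2, 3, };
static const int without_comma[] = { 1, 2, 3 };

/* Number of elements in an array (hypothetical helper macro). */
#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Returns 1 if both arrays have the same number of elements. */
int counts_match(void)
{
    return COUNT(with_comma) == COUNT(without_comma);
}
```

Calling `counts_match()` from a small `main` should return 1 on a conforming compiler.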

My question: To this day, dangling commas are still a part of C, and other languages too, it seems. Why weren't they removed? (Assuming they don't have any "proper" function (I could be wrong), why did they propagate to other languages?)

πάντα ῥεῖ
mmixLinus
  • 12
    Why should they be removed? – Nathan Pierson Feb 14 '22 at 15:07
  • @NathanPierson Assuming a list is _comma separated_ objects, a terminal comma would be superfluous, no? – mmixLinus Feb 14 '22 at 15:09
  • 1
    Because they are not harmful and are even beneficial, even in minor things such as the following: consider an initializer where each element is on a new line. Then some developer makes a change, adding another element. With a "dangling comma" the `diff` will show a single-line change, as it should. Without it, the `diff` will show a two-line change, where one line is mostly irrelevant (the addition of a comma). – Eugene Sh. Feb 14 '22 at 15:10
  • [Here](https://stackoverflow.com/questions/5245152/inline-property-initialisation-and-trailing-comma/5245344#5245344) is the corresponding question for C#, and the reasons are all the same. – Nate Eldredge Feb 14 '22 at 15:10
  • 1
    https://stackoverflow.com/questions/11597901/why-are-trailing-commas-allowed-in-a-list for python – KamilCuk Feb 14 '22 at 15:11
  • 1
    I guess it can be useful if dangling commas are allowed, for example for programs that generate C code. That way, the comma can always be unconditionally printed, whenever the value of an array element is printed. Otherwise, an additional `if` statement would be required. – Andreas Wenzel Feb 14 '22 at 15:11
  • 2
    Yes it is superfluous. But it simplifies the work for generating programs (scripts). And removing it would break existing programs. For instance: older (closed source) versions of Unix used it to configure & generate sources for in-kernel tables. – wildplasser Feb 14 '22 at 15:11
  • 1
    In addition to the other comments: if they removed dangling commas in the next version of the C standard, a lot of programs might no longer compile. This alone is enough to keep them, and unlike other things that have been removed from the C standard in the past (like the infamous `gets`), dangling commas are totally harmless. – Jabberwocky Feb 14 '22 at 15:15
  • 1
    It's "less superfluous" than the parentheses in `return (0);` ... about the "same superfluous" as `3 + (4 * 5)` – pmg Feb 14 '22 at 15:16
  • I understand the "it simplifies writing lists" (been there myself - a lot) and "code generators". Though @Jabberwocky's argument sounds more like _the crux of the problem.._ If a weirdness enters a grammar you simply **can't** remove it if it is already being widely used... Thanks for your comments. – mmixLinus Feb 14 '22 at 15:18
  • I had a chance in the 1990s to chat with Bjarne Stroustrup. I started citing my list of pet peeves, and Bjarne stopped me short and said "If you don't like it (C++), feel free to make your own programming language. I did." Fabulous response! – Eljay Feb 14 '22 at 15:25
  • @Eljay : ) great! – mmixLinus Feb 14 '22 at 15:27
  • Seconding (thirding, fourthing) what others have said: dangling commas are a *feature*. It used to drive me *crazy* that enums in C didn't allow them; fortunately this has been fixed. – Steve Summit Feb 14 '22 at 16:36

2 Answers

8

Why weren't they removed?

I don't know if such removal has been considered. I don't know of a reason why such removal would be considered.

To go even further, there would be good reason to add them if they didn't already exist. They are useful. Indeed, standard revisions have added trailing commas to other lists in these languages where they hadn't been allowed before, for example enumerations (C99, C++11). Furthermore, there are proposals to allow them in even more lists, such as the member initializer list. I would also like to see them allowed in function calls, as they are in some other languages.
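As a rough illustration of the enumeration case, here is a sketch that assumes a C99 or later compiler. The `color` enum and the `color_names` table are hypothetical names, not from any real code base.

```c
/* C99 and later accept a comma after the last enumerator. */
enum color {
    COLOR_RED,
    COLOR_GREEN,
    COLOR_BLUE,
    COLOR_COUNT,   /* trailing comma: no extra enumerator is created */
};

/* The same holds for a designated initializer list (a C99 feature). */
static const char *const color_names[COLOR_COUNT] = {
    [COLOR_RED]   = "red",
    [COLOR_GREEN] = "green",
    [COLOR_BLUE]  = "blue",
};
```

Adding a new color then means inserting one enumerator line and one table line, without touching the neighbouring lines.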

One reason to allow a trailing comma is easier modification (less work; less chance of mistakes) and cleaner diffs when programs are modified.

Here are some examples...

Old version without trailing comma:

int array[] = {
    1,
    2,
    3
};

New version:

int array[] = {
    1,
    2,
    3,
    4
};

Diff:

<     3
---
>     3,
>     4

Old version with trailing comma:

int array[] = {
    1,
    2,
    3,
};

New version:

int array[] = {
    1,
    2,
    3,
    4,
};

Diff:

>     4,
eerorika
  • 1
    I'm thinking the reason to remove them thirty/forty-odd years ago would have been different than today? (More widespread now, more code-generators today, used in many languages today, etc) – mmixLinus Feb 14 '22 at 15:25
  • 1
    Another similar advantage is that moving lines is consistent. Without trailing comma you have to update the commas if the last item changed. For example when moving `2,` to the last position. With trailing commas you don't have this issue. – 3limin4t0r Feb 14 '22 at 15:33
4

Allowing terminal commas is useful. Given some list:

int Codes[] =
{
    37,
    74,
    88,
};

then:

  • For human maintenance of lists, we can easily add or delete lines without fiddling with commas. If a terminal comma were not allowed, then appending a new line would also require editing the existing previous line to add a comma, and deleting the last line would also require editing the previous line to remove its comma.
  • For machine-generated lists, we do not need to include code in the loop generating the list to treat the last list item differently.
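The machine-generated case can be sketched roughly as follows. `emit_int_array` is a hypothetical helper written for this illustration, not part of any real tool. It writes every element with its comma, so the loop needs no special case for the last item.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical generator: writes a C array definition into `out`.
   Returns the number of characters written, or -1 if `out` is too small. */
int emit_int_array(char *out, size_t cap, const char *name,
                   const int *values, size_t n)
{
    size_t used = 0;
    int w = snprintf(out, cap, "int %s[] =\n{\n", name);
    if (w < 0 || (size_t)w >= cap)
        return -1;
    used = (size_t)w;

    for (size_t i = 0; i < n; i++) {
        /* Every element gets a comma; no "is this the last one?" test. */
        w = snprintf(out + used, cap - used, "    %d,\n", values[i]);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;
        used += (size_t)w;
    }

    w = snprintf(out + used, cap - used, "};\n");
    if (w < 0 || (size_t)w >= cap - used)
        return -1;
    used += (size_t)w;
    return (int)used;
}
```

Given the values 37, 74 and 88 and the name `Codes`, this should reproduce the list above, terminal comma included. Without the terminal-comma rule, the loop body would need a branch on `i == n - 1`.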
Eric Postpischil
  • I don't hold these two _opinions_ highly. Editing code is what I do **all day, every day** and whether there should be a comma after the last entry or not is (almost) irrelevant to me, and to most coders I would say. The situation occurs when you do a "linecopy-paste-paste-paste-paste etc" to create a list. – mmixLinus Feb 15 '22 at 07:48
  • Also, writing a code-generator is fraught with so many other difficulties, this must be a very small one (imho). – mmixLinus Feb 15 '22 at 07:50
  • @mmixLinus: So you claim there is little value in allowing a terminal comma. How is that in any way an argument to disallow it? No matter how small the value of allowing a terminal comma is, it exceeds the value of not allowing it, which is zero. – Eric Postpischil Feb 15 '22 at 09:20
  • "it exceeds the value of not allowing it, which is zero" - really? How about if one were to say _"a list is a comma separated ordering of objects"_ then, **by definition,** there shouldn't be a dangling comma. – mmixLinus Feb 15 '22 at 09:23
  • @mmixLinus: Why would we say that? The C standard does not say it. In some imaginary world where we said that, then we might be using misleading language, and possibly that might cause some confusion. So just do not say it. When teaching students about initializer lists, we simply teach them the existing rules. – Eric Postpischil Feb 15 '22 at 10:15
  • Yes, I understand. My pet peeve is basically one of semantics and syntax. I tend to wish for a "cleaner" definition of things, so in this case that "definition" of list I made up would be closer to how one would (could) define it _mathematically._ – mmixLinus Feb 15 '22 at 10:25