
Does C really need two single quotes (apostrophes) to delimit char literals instead of just one?

For string literals we do need to delimit the start and the end since strings vary in length, but it seems to me that we do know how long a char literal will be: either a single character (in the source), two characters if it is a regular character escape (prefix \), five characters if it is an octal literal (prefix \0[0-7]), etc.

Keep in mind that I am looking for a technical answer, not a historical one. Does it make parsing simpler? Did it make parsing simpler on 70s hardware? Does it allow for better parsing error messages? Things like that.

(The same question could be asked for most C syntax inspired languages since most of them seem to use the same syntax to delimit char literals. I think the Jai programming language might be an exception since I seem to recall that it just uses a single question mark (at the beginning), but I’m not certain.)

Some examples:

  • 'G'
  • '\0'
  • '\0723'

Would it work if we just used a single quote at the start of the token?

  • 'G
  • '\0
  • '\0723

Could we in principle parse these tokens the same way without complicating the grammar?

We see that the null byte literal and the octal literal have the same prefix, but there might not be any ambiguity, since '\0 followed immediately by 723 might not be parseable as anything other than a char literal (at least to my mind). And if there is an ambiguity, the null byte literal could become \z instead.
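As a rough illustration of the length-based idea, here is a minimal sketch of how a scanner for the proposed opening-quote-only syntax might compute a literal's length. It assumes standard C's rule that an octal escape consists of a backslash followed by at most three octal digits. The function name `prefix_char_literal_len` is hypothetical, and hex and universal character escapes are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length in source characters of a char literal
   written with only an opening quote, e.g. 'G or '\n.
   Returns 0 if src does not start with such a literal.
   Hex and universal character escapes are not handled in this sketch. */
static size_t prefix_char_literal_len(const char *src)
{
    size_t n;

    assert(src != NULL);
    if (src[0] != '\'' || src[1] == '\0')
        return 0;                      /* no quote, or quote at end of input */
    if (src[1] != '\\')
        return 2;                      /* quote + one ordinary character */
    if (src[2] == '\0')
        return 0;                      /* backslash at end of input */
    if (src[2] >= '0' && src[2] <= '7') {
        n = 3;                         /* quote, backslash, first octal digit */
        /* Assumption: at most three octal digits, as in standard C. */
        while (n < 5 && src[n] >= '0' && src[n] <= '7')
            n++;
        return n;
    }
    return 3;                          /* quote + simple escape such as \n */
}
```

Under these assumptions, `prefix_char_literal_len("'\\0723")` returns 5: the scanner stops after `\072` and leaves the `3` behind, which matches how standard C splits that escape sequence. So the length is determinable, but the last example from the question would not be read as a single octal character.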

Are the two single quotes needed in order to properly parse char literals?

Guildenstern

3 Answers


According to cppreference.com, C inherited multicharacter constants from the B programming language, so they have probably existed from the start. Since they can vary in width, the closing quote is pretty much a requirement.

Apart from that and aesthetics in general, a character constant representing the space character in particular would look somewhat awkward and be a likely magnet for mistakes if it was just ' instead of ' '.

ilkkachu

One answer (there might be more) might be that C99 supports multicharacter literals. See for example this SO question.

So for example 'left' is a valid (multi) char literal.

Once you have multichar literals you can no longer rely on a single opening quotation mark to delimit char literals. For example, how would you delimit the literal 'a c' with just one single quotation mark?

The meaning of such literals is implementation defined so I don’t know how widely-supported this feature is.
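To make the delimiting problem concrete, here is a minimal sketch of a scanner that finds the extent of a quoted character constant by looking for the closing quote. The function name `quoted_char_constant_len` is hypothetical and not taken from any real compiler; it only measures the token and does not validate escape sequences.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: scan a quoted character constant at the start of
   src and return the number of source characters it spans, closing quote
   included. Returns 0 if src does not start with a terminated, non-empty
   constant on a single line. Escape sequences are skipped, not validated. */
static size_t quoted_char_constant_len(const char *src)
{
    size_t i = 1;

    assert(src != NULL);
    if (src[0] != '\'')
        return 0;
    while (src[i] != '\0' && src[i] != '\n') {
        if (src[i] == '\'')
            return i > 1 ? i + 1 : 0;  /* '' (empty) is not a valid constant */
        if (src[i] == '\\' && src[i + 1] != '\0')
            i++;                       /* skip the character after a backslash */
        i++;
    }
    return 0;                          /* unterminated constant */
}
```

With the closing quote, `quoted_char_constant_len("'a c'")` returns 5 and the whole token is recovered. A scanner that only had the opening quote to go on would have no way to tell whether the space and the `c` belong to the constant or to whatever follows it.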

Guildenstern

Why does C use two single quotes to delimit char literals instead of just one?

Because several historical predecessors of C (e.g. PL/1, B, and some dialects of Fortran or ALGOL) did so.

And because the C standard (e.g. n1570 or something newer) specifies that.

And perhaps because in the 1970s it was faster to parse (for most char literals, such as 'z').

Basile Starynkevitch