int main()
{
    0xD-0; // Fine
    0xE-0; // Fails
}

The second line fails to compile on both clang and gcc. A hex constant ending in any other digit (0-9, A-D, F) is fine.

Error:

<source>:4:5: error: unable to find numeric literal operator 'operator""-0'
    4 |     0xE-0;
      |     ^~~~~

I have a workaround (adding a space between the constant and the minus sign), so I'd mainly like to know why this happens. Does the compiler think there's an exponent here?

https://godbolt.org/z/MhGT33PYP

Mike Vine
  • Edit: I've found https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3885 which may explain this – Mike Vine Jul 08 '22 at 09:38
  • A parsing error worth reporting, but I doubt you really need to subtract two literals, it would typically go like `constexpr int BASE = 0xE; BASE-0;` anyway – Alexey S. Larionov Jul 08 '22 at 09:40
  • I'm guessing it has to do with how C++ interprets `E+...` or `E-...`. It must first be interpreting `-0` as part of the number because of the `E` before it, then realising that it's an invalid suffix in this case (as it's not an integer suffix at all). Just a guess, though. I'd love to find out why this happens. – Ryan Zhang Jul 08 '22 at 09:40
  • Yeah - I guess it's confused about whether or not to treat the `E-0` as a floating-point exponent. As mentioned in the linked bug report, adding spaces around the `-` sign resolves the issue, it seems. – Adrian Mole Jul 08 '22 at 09:42
  • After digging around in the docs, I've found out why. I'll add an answer now. – Ryan Zhang Jul 08 '22 at 09:43
  • Note that clang-cl (in VS 2022) has no issue with this. – Adrian Mole Jul 08 '22 at 09:46
  • @AdrianMole That would be a bug on clang-cl's part if it's accepting it, since the linked bug report was judged to be "no bug". – Tony Tannous Jul 08 '22 at 09:58
  • Existing answer to same problem: https://stackoverflow.com/questions/49543516/why-doesnt-0xe1-compile – Vicky Jul 08 '22 at 10:06
  • Also see: https://en.cppreference.com/w/cpp/language/integer_literal which says: `Due to maximal munch, hexadecimal integer literals ending in e and E, when followed by the operators + or -, must be separated from the operator with whitespace or parentheses in the source:` – Mike Vine Jul 08 '22 at 10:49

1 Answer


Actually, this behaviour is mandated by the C++ standard (and documented), as strange as it may seem. It follows from how the compiler first splits the source into preprocessing tokens (a.k.a. pp-tokens).

If we look closely at how the compiler generates a token for numbers:

A preprocessing number is made up of a digit, optionally preceded by a period, and may be followed by letters, underscores, digits, periods, and any one of: e+ e- E+ E-.

According to this, the compiler reads 0x, then E-, and treats the E- as part of the number: E- is allowed inside a preprocessing number, and there is no whitespace before the E or between the E and the - (which is why adding a space is an easy fix).

This means that 0xE-0 is taken in as a single preprocessing token. In other words, the compiler sees one number rather than the two numbers 0xE and 0 joined by the operator -. At first glance, you would then expect the E to be read as the exponent of a floating-point literal.
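To make the "single token" point concrete, here is a rough sketch of a scanner that follows the rule quoted above. It is not how gcc or clang are actually implemented; `scan_pp_number` and its helpers are names I made up for illustration, and the sketch is simplified (ASCII only, no digit separators). It assumes C++17.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helpers, written by hand so everything stays constexpr.
constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }
constexpr bool is_letter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Returns the longest prefix of `src` that matches the quoted rule for a
// preprocessing number. Simplified: only e/E exponents are handled, as in
// the quote (the full grammar also allows p+, p-, P+, P-).
constexpr std::string_view scan_pp_number(std::string_view src) {
    std::size_t i = 0;
    if (i < src.size() && src[i] == '.') ++i;            // optional leading period
    if (i >= src.size() || !is_digit(src[i])) return {}; // a digit is required
    ++i;
    while (i < src.size()) {
        const char c = src[i];
        const char prev = src[i - 1];
        // A sign only continues the token directly after e or E.
        const bool sign_after_e =
            (c == '+' || c == '-') && (prev == 'e' || prev == 'E');
        if (sign_after_e || is_digit(c) || is_letter(c) || c == '_' || c == '.') {
            ++i;
        } else {
            break;
        }
    }
    return src.substr(0, i);
}

// The scanner has no idea what "hex" means: it swallows the sign after E.
static_assert(scan_pp_number("0xE-0;") == "0xE-0");
// After D the sign ends the token, so the minus becomes its own token.
static_assert(scan_pp_number("0xD-0;") == "0xD");
// A space after the E also ends the token.
static_assert(scan_pp_number("0xE - 0;") == "0xE");
```

Note that the scanner never looks at the 0x prefix. The decision to keep the - in the token is made purely on the preceding character being e or E, before anyone has worked out whether the token is an integer or a floating-point literal.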

Now let's take a look at how C++ interprets floating-point literals. Look at the section under "Examples". It gives this curious code sample:

    << "\n0x1p5         " << 0x1p5        // double
    << "\n0x1e5         " << 0x1e5        // integer literal, not floating-point

E is interpreted as part of the integer literal, and does not make the number a hexadecimal floating literal! Therefore, the compiler recognizes 0xE as a single hexadecimal integer. The -0 that follows is still part of the same preprocessing token, so it cannot be an operator followed by another integer; the only thing left for it to be is a literal suffix. Uh oh. There is no -0 suffix, which is why the error message complains about a missing literal operator 'operator""-0'.
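If you want to convince yourself of the integer vs. floating-point distinction, a small compile-time check along these lines should do it. The variable names are mine, and it assumes C++17 (hexadecimal floating literals such as 0x1p5 are not available in earlier standards).

```cpp
#include <type_traits>

// In a hex literal, e is just the digit 14, so 0x1e5 is a plain int.
constexpr auto int_value = 0x1e5;
// p introduces a binary exponent, so 0x1p5 is a double (1 * 2^5).
constexpr auto float_value = 0x1p5;

static_assert(std::is_same_v<decltype(0x1e5), int>);
static_assert(std::is_same_v<decltype(0x1p5), double>);
static_assert(int_value == 485);     // 1*256 + 14*16 + 5
static_assert(float_value == 32.0);
```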

And so the compiler reports the error shown in the question.
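For completeness, here are the workarounds mentioned in the question and comments (whitespace, parentheses, or a named constant), as a minimal sketch. The function names are made up for illustration; the asserts use the message-less form, so this assumes C++17.

```cpp
// Whitespace ends the preprocessing number right after the E.
constexpr int with_space() { return 0xE - 0; }

// A closing parenthesis ends it too.
constexpr int with_parens() { return (0xE)-0; }

// A named constant sidesteps the issue, since BASE is an identifier token.
constexpr int BASE = 0xE;
constexpr int with_named_constant() { return BASE-0; }

static_assert(with_space() == 14);
static_assert(with_parens() == 14);
static_assert(with_named_constant() == 14);
```

All three give the tokenizer something other than + or - immediately after the E, so the literal ends where you would expect it to.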

Ryan Zhang