Trigraphs dropped
Source files are encoded in a physical character set that is mapped in an implementation-defined way to the source character set, which is defined in the standard. To accommodate mappings from some physical character sets that didn't natively have all of the punctuation needed by the source character set, the language defined trigraphs—sequences of three common characters that could be used in place of a less common punctuation character. The preprocessor and compiler were required to handle these.
In C++17, trigraphs were removed. So some source files will not be accepted by newer compilers unless they are first translated from the physical character set to some other physical character set that maps one-to-one to the source character set. (In practice, most compilers had already made interpretation of trigraphs optional.) This isn't a subtle behavior change but a breaking change that prevents previously-acceptable source files from being compiled without an external translation process.
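For the curious, here's a sketch of what trigraph-era source looked like. A pre-C++17 compiler (or one with trigraph support explicitly enabled, such as GCC with -trigraphs) would treat the two spellings identically:

```cpp
// Each ??X sequence stood for one punctuation character:
//   ??=  ->  #      ??(  ->  [      ??)  ->  ]
//   ??<  ->  {      ??>  ->  }      ??!  ->  |

??=include <cstdio>             // same as: #include <cstdio>

int main()
??<                             // same as: {
    char buf??(4??) = "hi";     // same as: char buf[4] = "hi";
    std::puts(buf);
    return 0;
??>                             // same as: }
```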
More constraints on char
The standard also refers to the execution character set, which is implementation-defined but must contain at least the entire source character set plus a small number of control codes.
The C++ standard defined `char` as a possibly-unsigned integral type that can efficiently represent every value in the execution character set. With some help from a language lawyer, you can argue that a `char` has to be at least 8 bits.
If your implementation uses an unsigned value for `char`, then you know it can range from 0 to 255, and is thus suitable for storing every possible byte value.
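If you want to see which choice your implementation made, std::numeric_limits reports it; a minimal sketch:

```cpp
#include <iostream>
#include <limits>

int main() {
    using lim = std::numeric_limits<char>;
    std::cout << std::boolalpha
              << "char is signed: " << lim::is_signed << '\n'
              // unary + promotes char to int so the bounds print as numbers
              << "char min: " << +lim::min() << '\n'
              << "char max: " << +lim::max() << '\n';
}
```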
But if your implementation uses a signed value, it has options.
Most would use two's complement, giving `char` a minimum range of -128 to 127. That's 256 unique values.
But another option was sign+magnitude, where one bit is reserved to indicate whether the number is negative and the other seven bits indicate the magnitude. That would give `char` a range of -127 to 127, which is only 255 unique values, because one bit pattern is spent representing -0.
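The pattern at issue is the byte with only the sign bit set. A sketch of how the two representations read it (using std::int8_t, which is guaranteed to be two's complement wherever it exists):

```cpp
#include <cstdint>
#include <iostream>

int main() {
    // The 8-bit pattern 1000'0000 is 128 as an unsigned value.
    //   two's complement: -128 (all 256 patterns are distinct values)
    //   sign+magnitude:    -0  (a second spelling of zero, hence 255 values)
    auto x = static_cast<std::int8_t>(0b1000'0000);
    std::cout << static_cast<int>(x) << '\n';  // prints -128 on every real
                                               // (two's complement) machine
}
```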
I'm not sure the committee ever explicitly designated this as a defect, but it effectively was one: you couldn't rely on the standard to guarantee that a round trip from `unsigned char` to `char` and back would preserve the original value. (In practice, every implementation did preserve it, because they all used two's complement for signed integral types.)
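Concretely, here is the round trip in question. On every real (two's complement) implementation the assertion holds; the point is that the old wording didn't promise it for a sign+magnitude char:

```cpp
#include <cassert>

int main() {
    unsigned char original = 0x80;  // 128: outside signed char's guaranteed range
    char c = static_cast<char>(original);                // negative if char is signed
    unsigned char back = static_cast<unsigned char>(c);  // and back again
    // Two's complement: both conversions preserve the bit pattern, so
    // back == original. Sign+magnitude offers only 255 distinct char values
    // for 256 byte values, so at least two bytes would have to collide.
    assert(back == original);
}
```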
Only recently (C++17?) was the wording fixed to ensure round-tripping. That fix, along with all the other requirements on `char`, effectively mandates two's complement for signed `char` without saying so explicitly (even though, at that point, the standard still allowed sign+magnitude representations for other signed integral types). There was also a proposal to require two's complement for all signed integral types, and that one did make it into C++20.
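One visible consequence of the C++20 rule: conversion of an out-of-range value to a signed integer type is now defined as reduction modulo 2^N, which only makes sense with two's complement. A sketch:

```cpp
// C++20: out-of-range conversion to a signed type reduces modulo 2^N.
// 255 mod 256 selects the representable value congruent to 255: that's -1.
static_assert(static_cast<signed char>(255u) == -1);

// The same rule guarantees the unsigned char -> char -> unsigned char
// round trip preserves the value:
static_assert(static_cast<unsigned char>(static_cast<char>(0x80)) == 0x80);

int main() {}
```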
So this one is sort of the opposite of what you're looking for: it gives previously incorrect, overly presumptuous code a retroactive fix.