Context
I have a char variable on which I need to apply a transformation (for example, add an offset). The result of the transformation may or may not overflow.
I don't really care about the actual value of the variable after the transformation is performed.
The only guarantee I want is that I can retrieve the original value if I perform the transformation again in the opposite direction (for example, subtract the offset).
Basically:
char a = 42;
a += 140; // overflows (undefined behaviour)
a -= 140; // must be equal to 42
Problem
I know that overflow of signed types is undefined behaviour, but that is not the case for unsigned types. I have therefore chosen to add an intermediate step in the process to perform the conversion.
It would then become:
- char -> unsigned char conversion
- Apply the transformation (resp. the reversed transformation)
- unsigned char -> char conversion
This way, I have the guarantee that the potential overflow will only occur on an unsigned type.
Question
My question is: what is the proper way to perform such a conversion?
Three possibilities come to mind. I can use either:
- implicit conversion
- static_cast
- reinterpret_cast
Which one is valid (not undefined behaviour)? Which one should I use (correct behaviour)?
My guess is that I need to use reinterpret_cast: since I don't care about the actual value, the only guarantee I want is that the value in memory remains the same (i.e. the bits don't change), so that the transformation is reversible.
On the other hand, I'm not sure whether the implicit conversion or the static_cast triggers undefined behaviour when the value is not representable in the destination type (out of range).
I couldn't find anything explicitly stating whether it is or is not undefined behaviour. I only found this Microsoft documentation, where they did it with implicit conversions without any mention of undefined behaviour.
Here is an example, to illustrate:
char a = -4; // out of unsigned char range
unsigned char b1 = a; // (A)
unsigned char b2 = static_cast<unsigned char>(a); // (B)
unsigned char b3 = reinterpret_cast<unsigned char&>(a); // (C)
std::cout << std::boolalpha; // print bools as true/false (stays in effect for later output)
std::cout << (b1 == b2 && b2 == b3) << '\n';
unsigned char c = 252; // out of (signed) char range
char d1 = c; // (A')
char d2 = static_cast<char>(c); // (B')
char d3 = reinterpret_cast<char&>(c); // (C')
std::cout << (d1 == d2 && d2 == d3) << '\n';
The output is:
true
true
Unless undefined behaviour is triggered, the three methods seem to work.
Are (A) and (B) (resp. (A') and (B')) undefined behaviour if the value is not representable in the destination type?
Is (C) (resp. (C')) well defined?