The canonical example for mixed signed and unsigned arithmetic seems to be:
unsigned int u = 10;
int a = -42;
auto tmp1 = u - a;               // a is converted to unsigned int; tmp1 is unsigned int
std::cout << tmp1 << std::endl;
int tmp2 = tmp1;                 // converted back to int
std::cout << tmp2 << std::endl;
(Note that I am using auto tmp1 so I can show the intermediate result and inspect the type. In the real code this is all one line, of course.)
If you enable enough warnings (-Wconversion on clang-13 and, additionally, -Wsign-conversion on gcc-11), the compiler complains nicely:
main.cpp:148:21: warning: conversion to ‘unsigned int’ from ‘int’ may change the sign of the result [-Wsign-conversion]
148 | auto tmp1 = u - a;
| ^
main.cpp:150:16: warning: conversion to ‘int’ from ‘unsigned int’ may change the sign of the result [-Wsign-conversion]
150 | int tmp2 = tmp1;
But at least the "answer is correct":
Output:
52
52
Why this still "sort of works" is nicely explained here: https://stackoverflow.com/a/25609711/1087626
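To make the modular arithmetic concrete, here is a minimal sketch of the conversions involved, assuming a 32-bit unsigned int (which matches the output above):

#include <iostream>

int main() {
    unsigned int u = 10;
    int a = -42;
    // -42 is converted to unsigned int first: 2^32 - 42 = 4294967254
    unsigned int a_as_unsigned = static_cast<unsigned int>(a);
    std::cout << a_as_unsigned << std::endl;      // 4294967254
    // 10 - 4294967254 wraps modulo 2^32 back to 52
    std::cout << u - a_as_unsigned << std::endl;  // 52
    // Reinterpreting 52 as int is still 52, hence the "correct" answer
}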
Of course, the general, and correct, advice is: "Don't mix signed and unsigned types."
However, recently, in generic code where the user chooses the types, I came across this simplified example:
unsigned int count = 10;
int sum = -100;
auto avg1 = sum / count;         // sum is converted to unsigned int; avg1 is unsigned int
std::cout << avg1 << std::endl;
int avg2 = avg1;                 // converted back to int
std::cout << avg2 << std::endl;
The compiler still complains nicely if you have the right warnings enabled:
main.cpp:155:17: warning: conversion to ‘unsigned int’ from ‘int’ may change the sign of the result [-Wsign-conversion]
155 | auto avg1 = sum / count;
| ^~~
main.cpp:157:16: warning: conversion to ‘int’ from ‘unsigned int’ may change the sign of the result [-Wsign-conversion]
157 | int avg2 = avg1;
and the answer is catastrophically wrong:
Output:
429496719
429496719
In the real code, this was running in a loop where avg was feeding back into sum, so sum quickly overflowed. Being signed integer overflow, this was undefined behaviour. My UBSan dutifully kicked off, and butterflies flew out of the computer to make me lunch.
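For contrast with the subtraction case, here is a sketch of the same conversion applied to the division, again assuming a 32-bit unsigned int:

#include <iostream>

int main() {
    unsigned int count = 10;
    int sum = -100;
    // -100 is converted to unsigned int first: 2^32 - 100 = 4294967196
    unsigned int sum_as_unsigned = static_cast<unsigned int>(sum);
    std::cout << sum_as_unsigned << std::endl;          // 4294967196
    // 4294967196 / 10 == 429496719 (truncated). Unlike +, - and *,
    // division does not respect arithmetic modulo 2^32, so the result
    // can no longer be reinterpreted back to -10
    std::cout << sum_as_unsigned / count << std::endl;  // 429496719
}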
What is a safe way to solve this in generic code, when the user chooses the types?
- The obvious auto avg1 = sum / static_cast<int>(count);? This works: no more UB, correct answer, and no warnings. But it seems like a kludge / sledgehammer, and it could be hiding other issues (for instance, a count larger than INT_MAX would cast to a negative value).
- Restrict the types chosen by the user (using concepts or similar) such that sum, count and avg always have the same signedness; see the sketch after this list.

Or something else entirely...?
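For the second option, here is a minimal C++20 sketch of what such a constraint might look like; the concept name same_signedness is my own, not a standard one:

#include <concepts>

// Require both operand types to be integral and share signedness
template <typename A, typename B>
concept same_signedness =
    std::integral<A> && std::integral<B> &&
    (std::signed_integral<A> == std::signed_integral<B>);

template <typename Value, typename Sum, typename Count>
    requires same_signedness<Sum, Count>
Value calc_average(Sum sum, Count cnt) {
    return static_cast<Value>(sum / cnt);
}

// calc_average<int, int, unsigned>(-100, 10) now fails to compile
// instead of silently producing 429496719.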
EDIT
For clarity, here is what a toy template might look like, and how the user might get caught out.
#include <cstdlib>
#include <iostream>

template <typename Value, typename Sum, typename Count>
Value calc_average(Sum sum, Count cnt) {
    return sum / cnt;
}

int main() {
    // All good
    std::cout << calc_average<int, long, unsigned>(-100, 10) << "\n";
    // Oh, oh, Butterflies!!!
    std::cout << calc_average<int, int, unsigned>(-100, 10) << "\n";
    return EXIT_SUCCESS;
}
Output:
-10
429496719
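And for the first option, here is a sketch of one way to make the cast in the toy template less of a blunt instrument: assert that count is actually representable in the signed type before casting (std::in_range is C++20; the structure here is just an illustration, not a complete solution).

#include <cassert>
#include <cstdlib>
#include <iostream>
#include <type_traits>
#include <utility>

template <typename Value, typename Sum, typename Count>
Value calc_average(Sum sum, Count cnt) {
    if constexpr (std::is_signed_v<Sum> && std::is_unsigned_v<Count>) {
        // Cast the unsigned count to the signed Sum type, but check
        // first that the value actually fits
        assert(std::in_range<Sum>(cnt));
        return static_cast<Value>(sum / static_cast<Sum>(cnt));
    } else {
        return static_cast<Value>(sum / cnt);
    }
}

int main() {
    std::cout << calc_average<int, int, unsigned>(-100, 10) << "\n";  // -10
    return EXIT_SUCCESS;
}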
For those interested in the longer motivating example, it is all spelled out in this Code Review question: https://codereview.stackexchange.com/questions/274171/generic-exponential-damping-to-smooth-noisy-signals