
Context

I am writing a function which calculates an exponential backoff value for a timer application. It simply computes 2^x up to some threshold maxVal; once 2^x exceeds that threshold, the threshold itself should be returned. Edge cases (such as non-positive inputs) should also be handled in all cases by returning maxVal.

util.cpp:

#include "util.h"
#include <cmath>    // needed for std::pow
#include <iostream>

int calculateExponentialBackoffDelay(int x, int maxVal)
{
    int y;
    
    if(x <= 0 || maxVal <= 0) return maxVal;
    else if (x > maxVal) return maxVal;

    y = std::pow(2, x);
    std::cout << "y = " << y << std::endl;
    
    if(y > maxVal) return maxVal;
    else if(y < 0) return maxVal;
    else return y;
}

Now I make a CMake configuration with a GoogleTest dependency fetch.

CMakeLists.txt:

cmake_minimum_required(VERSION 3.14)
project(my_project)

# GoogleTest requires at least C++14
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_BUILD_TYPE Debug)

include(FetchContent)
FetchContent_Declare(
  googletest
  GIT_REPOSITORY https://github.com/google/googletest.git
  GIT_TAG release-1.12.1
)
# For Windows: Prevent overriding the parent project's compiler/linker settings
set(gtest_force_shared_crt ON CACHE BOOL "" FORCE)
FetchContent_MakeAvailable(googletest)

enable_testing()
include_directories(
    ${CMAKE_CURRENT_LIST_DIR}/
    )
# Add the main source compilation units
add_executable(
  test_calc
  test_calc.cpp
  util.cpp
)
target_link_libraries(
  test_calc
  GTest::gtest_main
)

include(GoogleTest)
gtest_discover_tests(test_calc)

I run a test on the function that I have written which tests some boundary conditions. One of which is testing that if 2^x > maxVal, then maxVal should just be returned (because the result of the 2^x is above the maximum value). This is the threshold.

test_calc.cpp:

#include "util.h"
#include <climits>
#include <gtest/gtest.h>

TEST(util_tests, ExponentialBackoff)
{
    int x, maxVal, res;

    // Test x = maxVal and maxVal = 1000
    // Expected output: 1000
    maxVal = 1000;
    x = maxVal;
    EXPECT_EQ(maxVal, calculateExponentialBackoffDelay(x, maxVal));
}

When I set x and maxVal to 1000, 2^1000 is calculated and, because it is such a big number, it overflows / wraps around to a very negative value (-2147483648). That is expected, and therefore for x=1000, maxVal=1000 my test expects the internal result to be < 0, so that maxVal is returned.

Problem

This is where things go unexpectedly. I run ctest inside my build directory and all test cases pass. Then I change one line in CMakeLists.txt from ... :

set(CMAKE_BUILD_TYPE Debug)

... to ...

set(CMAKE_BUILD_TYPE Release)

... and that test case fails:

1: Expected equality of these values:
1:   maxVal
1:     Which is: 1000
1:   calculateExponentialBackoffDelay(x, maxVal)
1:     Which is: -2147483648

So for some reason, the case where y < 0 inside the function body is not being reached and instead, we are returning the wraparound result.

Why is this? What am I doing wrong? I tried running the Linux strip -s test_calc (while keeping the Debug configuration in CMake) to check whether it was a symbols thing, only to find that the test cases still pass. What else does CMake do to change the comparison behaviour of the resulting binary?

Phippsy
    Technically the signed integer overflow is [undefined behavior](https://en.cppreference.com/w/cpp/language/ub) , so maybe that's why you got weird results. Try to work with `double y;` to hold the `std::pow` result, then check if it is not greater than max, then cast it back to int. – pptaszni Aug 22 '22 at 15:22
  • Imho `CMAKE_BUILD_TYPE` shouldn't be set inside the `CMakeLists.txt` file; the value should be provided by the user via the `-D` option during configuration or equivalent. Otherwise you restrict the uses of your project. It definitely shouldn't be set after the first `project()` command. Also, why return `maxVal` for `x = 0`? `2^0 = 1`. And why not implement this as `return (x < 0 || x >= (sizeof(int)*8 - 1)) ? maxVal : std::min(1 << x, maxVal);`? – fabian Aug 22 '22 at 17:13
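The two suggestions in the comments above can be sketched as follows (function names are hypothetical, and the shift variant assumes a 32-bit `int`); both avoid the out-of-range double-to-int conversion discussed in the answers:

```cpp
#include <algorithm>
#include <climits>
#include <cmath>

// pptaszni's suggestion: hold the pow result in a double and compare
// against maxVal BEFORE the narrowing cast, so the cast is always in range.
int backoffViaDouble(int x, int maxVal)
{
    if (x <= 0 || maxVal <= 0) return maxVal;
    double y = std::pow(2.0, x);       // a double comfortably holds 2^1000
    if (y > maxVal) return maxVal;     // compare in double: no UB possible
    return static_cast<int>(y);        // safe: y <= maxVal <= INT_MAX here
}

// fabian's suggestion: use a shift, guarded so the shift itself is defined
// (1 << x is UB for x >= 31 when int is 32 bits wide).
int backoffViaShift(int x, int maxVal)
{
    if (x < 0 || x >= static_cast<int>(sizeof(int) * CHAR_BIT - 1))
        return maxVal;
    return std::min(1 << x, maxVal);
}
```

Note that, as fabian points out, the shift variant returns 1 for x = 0 (since 2^0 = 1), unlike the original function.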

2 Answers


Floating-integral conversions

  • A prvalue of floating-point type can be converted to a prvalue of any integer type. The fractional part is truncated, that is, the fractional part is discarded. If the value cannot fit into the destination type, the behavior is undefined. - This is it.

The behavior is undefined. Any expectation about the result of int y = std::pow(2, x) with x > 31 is invalid, and the call may lead to any result from calculateExponentialBackoffDelay.

In this particular case, the compiler knows that y = std::pow(2, x) is always greater than 0 for valid values of x, and drops the if (y < 0) branch.

273K
  • This [famous series of blog posts](https://blog.regehr.org/archives/213) (John Regehr) is just one of many fine writeups available on what problems undefined behavior can cause in optimized (and unoptimized!) code - and what things lead to undefined behavior. And thus on what's different between Debug and Release builds. – davidbak Aug 22 '22 at 15:45
  • @Tsyvarev https://stackoverflow.com/questions/16188263/is-signed-integer-overflow-still-undefined-behavior-in-c: *If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the **behavior is undefined**.* Conversion is not evaluation. – 273K Aug 22 '22 at 15:46
  • Signed overflow is related to signed integral types, right? This is truncation - 2¹⁰⁰⁰ does not _fit_ in an `int` (or `long int` or `long long int`!) on any machine you're likely to be using. Therefore: UB. – davidbak Aug 22 '22 at 16:56
  • @davidbak Out-of-range cast to `int` is legal since C++20, and was implementation-defined until C++20. – 273K Aug 22 '22 at 17:03
  • @Tsyvarev *compiler optimization may NOT remove branches occurred due to implementation defined behavior* They are allowed to remove branches if they discover constexpr conditions regardless of UB. They discover `(y = std::pow(2, x)) >= 0` and remove the branch `if (y < 0)`. – 273K Aug 22 '22 at 17:08
  • @273K - the result of `pow` is a floating point type, right? (E.g., [here](https://en.cppreference.com/w/cpp/numeric/math/pow)). Therefore this is a [floating-integral conversion](https://en.cppreference.com/w/cpp/language/implicit_conversion#Integral_conversions). Not an integral conversion. Right? (Python has a `pow` that takes integers and returns integers but not C++.) – davidbak Aug 22 '22 at 17:30
  • BTW I did _not_ know that the out-of-range thing is now legal C++. Do you have a reference to one of the wg21 papers on that? I'd like to read the rationale - i.e., what made them change it. Perhaps something to do with constexpr? It is certainly listed as a C++20 change in that implicit conversion page on cppreference we've been linking but the list of defects at the bottom doesn't seem to have it. – davidbak Aug 22 '22 at 17:41
  • One more tidbit - now that I look at `pow(int,int)` for this case it will return [`HUGE_VAL`](https://en.cppreference.com/w/cpp/numeric/math/HUGE_VAL) which is going to be (most machines with IEEE) +∞. Doesn't change anything in this case (UB I assume when converting +∞ to int) but may cause _other_ strangeness the programmer isn't expecting if used in a floating-point context (because frequently arithmetic with IEEE infinities is not expected by the programmer...) (And _especially_ in a unit testing context!!) – davidbak Aug 22 '22 at 17:49
  • The standard is here [Expressions - Standard conversions - Integral conversions](https://eel.is/c++draft/conv.integral#:conversion,integral). The proposal needs more time. – 273K Aug 22 '22 at 17:51
  • -1, the quote here applies to float-to-float conversions, not float-to-int. The latter is described in the [next paragraph](https://en.cppreference.com/w/cpp/language/implicit_conversion#Floating.E2.80.93integral_conversions). Although the conclusion that there is UB is correct. – Yksisarvinen Aug 22 '22 at 19:06
  • @273K I didn't mean the link, I meant the quote in your answer, on which it is built. I'd have to edit about half of your answer to replace it with the correct one; that seems like too big an intervention for me. – Yksisarvinen Aug 22 '22 at 21:54
  • No worries, removed my vote now :) – Yksisarvinen Aug 22 '22 at 21:56

When changing build type changes result, your first thought should always be "Undefined Behaviour somewhere in the code". And this is indeed the case. Converting double (result of pow) to integer type with value that is out of range for that integer type is Undefined Behaviour.

Quote from cppreference:

A finite value of any real floating type can be implicitly converted to any integer type. Except where covered by boolean conversion above, the rules are:

  • The fractional part is discarded (truncated towards zero).
    • If the resulting value can be represented by the target type, that value is used
    • otherwise, the behavior is undefined

2 to the power 1000 is roughly 1.07e301, which is well beyond any possible range for the int type, so converting it in y = std::pow(2, x); is definitely UB. And UB on modern compilers is very hard to reason about.


Here's one attempt at reasoning it out anyway:

  1. Compilers can optimize code based on assumption that there is no UB in the code.
  2. Unless there is UB, the only valid outcomes for y = std::pow(2, x); are positive values (2 raised to any power is always positive).
  3. There can be no UB in the program (point 1), so condition if (y < 0) is always false (point 2) and can be optimized away.

But that is just my guess at what happens; it may be correct or completely wrong. The compiler is allowed to do absolutely anything with code that contains any UB at all.
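One way to make sure that reasoning chain never fires is to reject x before calling pow at all, so the double-to-int cast only ever happens for values that fit. A sketch (hypothetical name, assumes 32-bit int):

```cpp
#include <cmath>

// Reject out-of-range exponents up front: the cast below is then
// always in range, so there is no UB for the optimizer to exploit.
int safeBackoff(int x, int maxVal)
{
    if (x <= 0 || maxVal <= 0) return maxVal;
    if (x >= 31) return maxVal;                    // 2^x would exceed INT_MAX
    int y = static_cast<int>(std::pow(2.0, x));    // in range: well-defined
    return y > maxVal ? maxVal : y;
}
```

With this version the Debug and Release builds have no room to disagree, because every conversion in the function is defined behavior.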

Yksisarvinen
  • It would be good to know the answer to all my questions in the final paragraph, especially "What else does CMake do to change the comparison behaviour of the resulting binary?". The behaviour being undefined doesn't explain why the results change when I simply switch `Debug` to `Release`, and I would like to know what is happening to cause that change. – Phippsy Aug 22 '22 at 15:40
  • 1
    @273K There is no overflow. UB happens at the point of conversion, because conversion from floating point type to integer type is only defined if the result value can be represented using target type ([cppreference](https://en.cppreference.com/w/c/language/conversion#Real_floating-integer_conversions)) – Yksisarvinen Aug 22 '22 at 18:41
  • 1
    @Phippsy Undefined Behaviour is undefined. Explaining UB on modern compilers is about as good as reading tea leaves. Compiler can optimize a lot of things based on the assumption that your program has no UB. For example, it can skip `if (y < 0)` entirely, because there is no way to have that condition `true` without invoking UB on double-to-int conversion (which *may* explain your results). A good article on how unpredictable UB became with modern compilers: [Undefined behavior can result in time travel](https://devblogs.microsoft.com/oldnewthing/20140627-00/?p=633) – Yksisarvinen Aug 22 '22 at 18:54