Compiler wrongly removes an instruction with -O2 optimization; is there any undefined behaviour in this code?

Question

I am working on a function that converts a long binary number to a decimal string. I store the binary number in a uint32_t array, least significant word first (little-endian). For example, the array uint32_t a[2]{0,1} represents 4294967296 (2^32).
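To make the layout concrete, here is a minimal sketch of how a two-word array maps to its value. The helper name two_limbs_to_u64 is made up for illustration and is not part of my real code.

```cpp
#include <cstdint>

// Hypothetical helper: interprets a[0] as the low 32 bits and a[1] as the
// high 32 bits, which is the little-endian word order described above.
inline uint64_t two_limbs_to_u64(const uint32_t a[2]) {
    return static_cast<uint64_t>(a[0]) | (static_cast<uint64_t>(a[1]) << 32);
}
```

With a[2]{0,1} the low word is 0 and the high word is 1, so the value is 1 * 2^32.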

Code:

#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>


char* to_string(uint32_t* data,int size)  {
        if (size == 0) {
            return new char[2]{ '0','\0' };
        }

        int decimal_size = 10 * size;
        char* decimal_res = new char[decimal_size];
        for (int i = 0; i < decimal_size; i++) {
            decimal_res[i] = 0;
        }

        int rank = 0;
        uint64_t* data_ptr = new uint64_t[size]; 
        for (int i = 0; i < size; i++) {
            data_ptr[i] = data[i];

        }


        int current_index = size - 1;
        while (current_index >= 0) {

            char remainder = 0;
            if (data_ptr[current_index] < 10) {
                remainder = data_ptr[current_index];
                current_index--;
            }

            //div
            for (int i = current_index; i >= 0; i--) {

                ((uint32_t*)&data_ptr[i])[1] = remainder;

                uint64_t num = data_ptr[i];

                //printf("num:%ld\n", num);

                data_ptr[i] = num / 10;


                remainder = num - data_ptr[i] * 10;

            }
            if (rank < decimal_size) {
                decimal_res[rank] = remainder;
            }
            else {
                abort();
            }

            rank++;
        }


        delete[] data_ptr;

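        // Check rank > 0 first so decimal_res[-1] is never read.
        while (rank > 0 && !decimal_res[rank - 1]) {
            rank--;
        }
        if (rank == 0) {
            // Free the buffer before returning; a delete after the return is unreachable.
            delete[] decimal_res;
            return new char[2]{ '0','\0' };
        }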


        char* real_res = new char[rank + 1];
        real_res[rank] = '\0';
        for (int i = 0; i < rank; i++) {
            real_res[i] = decimal_res[rank - i - 1] + '0';
        }

        delete[] decimal_res;
        return real_res;
    }

int main(){
    int size=0;
    scanf("%d", &size);
    uint32_t* raw_data_ptr = new uint32_t[size];
    for(int i=0;i<size;i++){
        scanf("%ud,",&raw_data_ptr[i]);
    }


    char * res = to_string(raw_data_ptr,size);
    printf("%s",res);
    delete[] res;
    return 0;
            
}

Basically, in each iteration of the outer loop, I divide the large binary number by 10 and keep the remainder as the next decimal digit. This works correctly without -O2 optimization: if I input 2 0 1, it outputs 4294967296. But with -O2 optimization, it outputs 0.
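For reference, the same divide-by-10 idea can be sketched without writing through a cast pointer, by combining the carried remainder and the current word with a shift. This is only a sketch under my own assumptions: the names div10_in_place and to_decimal are hypothetical, and it uses std::vector and std::string instead of the raw arrays in my code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: divides the little-endian number in `limbs` by 10
// in place and returns the remainder (0..9).
inline uint32_t div10_in_place(std::vector<uint32_t>& limbs) {
    uint32_t remainder = 0;
    for (std::size_t i = limbs.size(); i-- > 0;) {
        // Put the carried remainder in the high half using integer
        // arithmetic rather than a uint32_t* view of a uint64_t object.
        uint64_t num = (static_cast<uint64_t>(remainder) << 32) | limbs[i];
        limbs[i] = static_cast<uint32_t>(num / 10);  // fits: num < 10 * 2^32
        remainder = static_cast<uint32_t>(num % 10);
    }
    return remainder;
}

// Hypothetical helper: converts little-endian 32-bit words to a decimal string.
inline std::string to_decimal(std::vector<uint32_t> limbs) {
    std::string digits;
    // Drop most significant words that are zero.
    while (!limbs.empty() && limbs.back() == 0) limbs.pop_back();
    while (!limbs.empty()) {
        digits.push_back(static_cast<char>('0' + div10_in_place(limbs)));
        while (!limbs.empty() && limbs.back() == 0) limbs.pop_back();
    }
    if (digits.empty()) return "0";
    // Digits were produced least significant first.
    std::reverse(digits.begin(), digits.end());
    return digits;
}
```

Calling to_decimal({0, 1}) corresponds to the input 2 0 1 above.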

But if I insert a printf call inside the inner loop, it works correctly even with -O2 optimization. So I looked at the assembly code the compiler generated, and it seems that the compiler removed some of the code I wrote.

Besides, if I replace ((uint32_t*)&data_ptr[i])[1] = remainder; with ((uint32_t*)data_ptr)[2 * i + 1] = remainder;, it also works well.
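Another variant I could try, if the cast is the problem, is to copy the bytes with std::memcpy instead of storing through a uint32_t* that points into a uint64_t. A minimal sketch, assuming a little-endian host (so the upper half sits at byte offset 4); set_high_half is a hypothetical name:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: overwrites the upper 32 bits of `word` with `high`.
// Assumes a little-endian host, where the upper half is the last 4 bytes.
inline void set_high_half(uint64_t& word, uint32_t high) {
    // Byte-wise access through unsigned char* is allowed for any object,
    // so the compiler has to treat this copy as a modification of `word`.
    std::memcpy(reinterpret_cast<unsigned char*>(&word) + sizeof(uint32_t),
                &high, sizeof high);
}
```

In the inner loop this would stand in for the cast line, as set_high_half(data_ptr[i], remainder);. I have not confirmed that this is the actual cause.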

So is this a compiler bug, or does my code have undefined behaviour?

compiler version:

g++ (Ubuntu 9.3.0-10ubuntu2) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE

system: Ubuntu 20.04

Do you know [what is the strict aliasing rule?](https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule) — JaMiT, Sep 17 '21 at 13:19
you are using the wrong format specifier: https://godbolt.org/z/vfE554jT7 — 463035818_is_not_an_ai, Sep 17 '21 at 13:20
sorry about that, it is correct now.@463035818_is_not_a_number — callFromfuture, Sep 17 '21 at 13:20
thanks a lot, I have changed the code. But the error remains.@463035818_is_not_a_number — callFromfuture, Sep 17 '21 at 13:23
I think you are right, it is about strict aliasing rule. @JaMiT — callFromfuture, Sep 17 '21 at 13:27
@callFromfuture: It's too bad the authors of C89 never explicitly stated within the Standard itself that situations that invoke UB include the use of constructs that are non-portable *but correct*, and that support for non-portable programs was intended to be treated as a quality-of-implementation issue outside the Standard's jurisdiction. There was never any reason for quality compilers not to support type punning in cases whose behavior would be defined by the underlying platform, and where cross-type pointer derivation would be obvious to any compiler that made any effort to notice it. — supercat, Sep 17 '21 at 19:15
