1

I am making an os and have booted into a 64 bit kernel made in c. I have made a print function which is working and am trying to make a function to convert hex values to string so I can print them. My code is causing boot loops, yet when I compile the exact same code to run normally in linux it works perfectly. The relevant code:

int logarithm(double value, int base, int* output) {
    int i = 0;

    while (value > 1) {
        value /= base;
        i++;
    }

    *output = i;
}

int power(int value, int power, int* output) {
    if (power == 0) {
        value = 1;
    } else {
        for (int i = 0; i < (power - 1); i++) {
            value *= value;
        }
    }

    *output = value;
}

void hexToStr(unsigned int hex, char** string) {
    int hexLength = 0;
    logarithm((double)hex, 16, &hexLength);
    char output[hexLength];
    output[hexLength] = 0;

    int powerValue = 0;
    for (int i = 0; i < hexLength; i++) {
        power(16, i, &powerValue);
        output[hexLength - i - 1] = (hex & (powerValue * 15)) / powerValue + '0';
    }

    *string = output;
}

If I change the hexToStr() function code to this (removing the need for logarithm() and power() functions by hardcoding values for the string), it works in both linux and my kernel:

void hexToStr(unsigned int hex, char** string) {
    int hexLength = 10;
    char output[hexLength];
    output[hexLength] = 0;

    int powerValue = 0;
    for (int i = 0; i < hexLength; i++) {
        output[hexLength - i - 1] = 'A';
    }

    *string = output;
}

Any suggestions as to why this would happen?

  • Followup question with the basic UB problems fixed: [Error passing by reference when making an os](https://stackoverflow.com/q/71996638) – Peter Cordes Apr 25 '22 at 09:25

2 Answers2

3

The presented code invokes undefined behavior. For example let's consider this function

void hexToStr(unsigned int hex, char** string) {
    int hexLength = 10;
    char output[hexLength];
    output[hexLength] = 0;

    int powerValue = 0;
    for (int i = 0; i < hexLength; i++) {
        output[hexLength - i - 1] = 'A';
    }

    *string = output;
}

In this assignment statement:

    output[hexLength] = 0;

there is written data outside the array because the valid range of indices is [0, hexLength).

Or the function sets a pointer passed to the function by reference to the local array output that will not be alive after exiting the function. So the returned pointer will have an invalid value.

Another example the result value of the function power when the parameter value is equal to 3 and the parameter power is equal to 3 will be equal to 81 instead of 27 due to the assignment statement in this for loop.

    for (int i = 0; i < (power - 1); i++) {
        value *= value;
    }

Moreover the function returns nothing though its return type is not void.

int power(int value, int power, int* output) {

Also this expression

(hex & (powerValue * 15)) / powerValue + '0'

does not make a sense.

Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
  • The power function is supposed to work like that for my purposes. The return type is just wrong as i have forgotten to change it as previously i was returnng a value, it is void in my code. I would have thought your point about the writing outside of array to be fine given my situation (the string is being passed to a while loop that increments until it finds a 0), and it does work in regular c run on linux, either way changing it to char output[hexLength + 1]; or the previous things mentioned does not fix my issue. – Kepler_7894i Apr 24 '22 at 20:23
  • @Kepler_7894i Undefined behavior is undefined behavior. – Vlad from Moscow Apr 24 '22 at 20:46
  • (hex & (powerValue * 15)) / powerValue + '0' Well this does function as intended so that's not the issue, the code runs fine on its own as intended the issue is why it causes a boot loop when run apart of my kernel. – Kepler_7894i Apr 24 '22 at 20:46
  • @Kepler_7894i What is "boot loop"? – Vlad from Moscow Apr 24 '22 at 20:47
  • "Undefined behavior is undefined behavior." fair enough which is why I have changed it but despite this I am still having the same issue. – Kepler_7894i Apr 24 '22 at 20:48
  • "I am making an os and have booted into a 64 bit kernel made in c." the virtual machine the code is running in is constantly rebooting due to the code being run. – Kepler_7894i Apr 24 '22 at 20:49
  • @Kepler_7894i It is due to the undefined behavior.. The function hexToStr returns an invalid pointer. – Vlad from Moscow Apr 24 '22 at 20:52
  • In that case why does the code work when run as regular c code on linux but not in my kernel. – Kepler_7894i Apr 24 '22 at 20:57
  • @Kepler_7894i It depends on the environment where the code is run. – Vlad from Moscow Apr 24 '22 at 20:58
  • I changed the *string = result; to: for (int i = 0; i < hexLength + 1; i++) { (*string)[i] = result[i]; } excpet now I am getting a segmentation fault? – Kepler_7894i Apr 24 '22 at 21:30
  • @Kepler_7894i: Undefined behaviour doesn't mean "required to fail". It may well **happen to work**, depending on exactly what asm the compiler generates, and what's in surrounding memory to get stepped on by out-of-bounds array writes. Testing can reveal problems, but a successful test can't prove correctness, especially in languages like C that aren't memory-safe and have undefined behaviour! – Peter Cordes Apr 25 '22 at 02:18
  • @Kepler_7894i: For non-void functions without a `return`, it's UB in C for the caller to use the return value. But in practice, with optimization disabled, GCC evaluates expressions using the return-value register as a temporary, so usually `foo=bar` and then falling off the end of a function returns `bar`. People [abuse this for code-golf](https://codegolf.stackexchange.com/a/106067/30206), see comments under that linked answer. – Peter Cordes Apr 25 '22 at 02:23
  • I have corrected the code above so that it is a void return type without a return. However I am still having an issue. – Kepler_7894i Apr 25 '22 at 07:28
  • @Kepler_7894i Please do not change your code in the question. This only will confuse readers of the question and answers. Close the question and ask a new question with your new updated code providing a minimal complete program that reproduces the problem. – Vlad from Moscow Apr 25 '22 at 08:20
  • Ok. I have made a new question: https://stackoverflow.com/questions/71996638/error-passing-by-reference-when-making-an-os, and have put this question back to its original code. – Kepler_7894i Apr 25 '22 at 08:35
0

Needed to enable SSE unit to work with floats and doubles. As well as change how values are passed back. Working code:

void log(float value, float base, uint64_t* output) {
    uint64_t i = 0;

    while (value >= 1) {
        value /= base;
        i++;
    }

    *output = i;
}

void pow(uint64_t value, uint64_t exponent, uint64_t* output) {
    uint64_t result = 1;

    for (uint64_t i = 0; i < exponent; i++) {
        result = result * value;
    }

    *output = result;
}

void hexToStr(uint64_t hex, char* output) {
    uint8_t hexLen = 16;
    log((float)hex, (float)16, &hexLen);
    char result[hexLen + 3];
    result[0] = '0';
    result[1] = 'x';
    result[hexLen + 2] = 0;

    uint64_t powerValue = 1;
    for (uint8_t i = 0; i < hexLen; i++) {
        pow(16, i, &powerValue);
        result[hexLen - i + 1] = (hex & (uint64_t)(powerValue * (uint64_t)15)) / powerValue + '0';
    }

    for (uint8_t i = 0; i < hexLen + 3; i++) {
        switch(result[i]) {
            case ':':
                result[i] = 'A';
                break;
            case ';':
                result[i] = 'B';
                break;
            case '<':
                result[i] = 'C';
                break;
            case '=':
                result[i] = 'D';
                break;
            case '>':
                result[i] = 'E';
                break;
            case '?':
                result[i] = 'F';
                break;
        }
        output[i] = result[i];
    }
}
  • [How do I enable SSE for my freestanding bootable code?](https://stackoverflow.com/q/31563078) explains how to actually enable SSE. You forgot to do that in your answer. However, it still makes no sense to be using floating-point at all here. You could very easily and more efficiently use `value >>= 4` for `uint64_t value` in that loop. Or even more efficiently, `(63 - __builtin_clzll(value)) / 4` to get the number of non-zero 4-bit chunks. – Peter Cordes Apr 27 '22 at 10:39
  • Also, rounding of large numbers on conversion from `uint64_t` to `float` (or even `double`) would make your function pad with a leading zero sometimes. e.g. when `0xffffffffffff` rounds up to `0x1000000000000`. (I don't think numbers can round down to one with fewer hex digits). At large magnitudes, float and double can't exactly represent every integer, only ones `2^n` apart. e.g. above 2^23, only even numbers. Around 2^63, only multiples of 2^40. Using `double` wouldn't fix this: https://en.wikipedia.org/wiki/Double-precision_floating-point_format#Precision_limitations_on_integer_values – Peter Cordes Apr 27 '22 at 10:46
  • Have a look at https://www.h-schmidt.net/FloatConverter/IEEE754.html to play around with how `float` represents large values. e.g. 2^44 -1 is 0xfffffffffff, or in decimal 17592186044415. Converting to float gives you a float that represents the value 1 higher. (Division by `(float)16` is exact, though: it's basically just subtracting 4 from the exponent of the float. The problem with your algorithm is the rounding in the initial conversion simply because you used float instead of just using uint64_t, I guess because you weren't sure how to round up?) – Peter Cordes Apr 27 '22 at 10:51