
The following Python 3 code attempts to compile a C++ program and use it to reinterpret a float as an int while leaving the memory untouched:

import sys, os
import numpy as np
import matplotlib.pyplot as plt 

# Build a C++ Script that accepts a str,
# converts it to a float, and prints 
# the result of the operation
def build():
    script = """
#include <iostream>
#include <string>  // std::stof

int main(int argc, char *argv[]){

  float f = std::stof(argv[1]);
  int i = *(short *)&f;


  std::cout << f << " " << i <<std::endl;

  return 0;

}
    """
    with open('script.cpp', 'w') as f:
        f.write(script)

    return 1

# Loads the results from the C++ script
def load_results():
    x,y = [],[]
    with open('results-ctest.txt', 'r') as f:
        result = f.readlines()
    for _ in result:
        local = _.split(' ')
        x.append(float(local[0]))
        # int() ignores the trailing newline; slicing with [:-2] would also drop the last digit
        y.append(int(local[1]))

    return x,y

# Plots the results from the C++ script
def show_results(x,y):
    # Define a figure
    f,ax = plt.subplots()

    # Plot results
    ax.scatter(x,y)

    # Format the axis according to the shown figure
    ax.set_xticks(np.linspace(min(x), max(x), 20))
    ax.set_yticks(np.linspace(min(y), max(y), 20))
    plt.show()

if __name__=='__main__':

    # build the C++ script
    build()

    # Compile the C++ script
    # and clean the previous results
    # by removing "results-ctest.txt"
    os.system(f'g++ script.cpp')
    os.system('rm results-ctest.txt')

    # Generate 500 floats between -1.000.000 and 1.000.000
    # and pass them to the C++ script
    numbers=np.linspace(-1e6, 1e6, 500)
    for number in numbers:
        os.system(f'./a.out {number}>> results-ctest.txt')

    # Open the results of the C++ script and 
    # split the input from the output
    x,y = load_results()

    # Produce the figure and open
    # a window for it
    show_results(x,y)

The apparent problem is that the output integers, plotted against the input floats, look as follows:

[Figure: output integers vs. input floats as computed by the attached Python 3 code]

Nevertheless, if both "int" and "float" are implemented with 4 bytes, as in the following figure, then the input and the output should have the same sign.

[Figure: 4-byte implementation of float (above) and int (below)]

In summary, the following is used to create an int in C++, and its sign is not preserved, as the first figure shows.

  float f = std::stof(argv[1]);
  int i = *(short *)&f;

thank you


EDIT: The bottom line is that there was a typo. I am editing the question in order to show the 'correct' plot

As stated in the comments, the problem was the following line:

int i = *(short *)&f;

which should have been:

int i = *(int *)&f;

and thus yields the following chart:

[Figure: results after fixing the typo]
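To illustrate why the `short` version loses the sign, here is a minimal sketch. It assumes a 32-bit IEEE 754 float and a little-endian machine, where `*(short *)&f` would read the low 16 bits of the mantissa. The helper names `full_bits` and `low_half` are made up for this example, and `memcpy` is used so that the sketch itself stays free of aliasing problems.

```cpp
#include <cstdint>
#include <cstring>

// Copy all 32 bits of the float into a signed 32-bit integer.
// Assumption: float is 32 bits wide (checked at compile time).
std::int32_t full_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::int32_t), "expects a 32-bit float");
    std::int32_t i;
    std::memcpy(&i, &f, sizeof i);
    return i;
}

// Mimic what reading the float through a short* sees on a little-endian
// machine: only the low 16 bits, which belong to the mantissa.
std::int16_t low_half(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    std::uint16_t low = static_cast<std::uint16_t>(u & 0xFFFFu);
    std::int16_t s;
    std::memcpy(&s, &low, sizeof s);  // reinterpret the 16 bits as signed
    return s;
}
```

Under those assumptions, `full_bits(x)` is negative exactly when the sign bit of `x` is set, while the sign of `low_half(x)` is decided by mantissa bit 15. For example, `1.00390625f` (bit pattern `0x3F808000`) is positive, yet its low half is negative.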

Gaston
  • `short` is likely 2 bytes, not 4. – Bill Lynch Dec 25 '20 at 20:55
  • `int i = *(short *)&f;` -- [Strict aliasing rule](https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule) – PaulMcKenzie Dec 25 '20 at 20:59
  • Bill Lynch gave the actual answer to your question. As an aside, you're invoking undefined behavior, see e.g. https://en.cppreference.com/w/c/language/object#Strict_aliasing. If you want to achieve the same thing without undefined behavior, use `memcpy`. – Cereal Dec 25 '20 at 21:01
  • This is not how you convert floats to ints in C++. Which C++ textbook taught you this? Whichever one it is, you need a better C++ textbook. – Sam Varshavchik Dec 25 '20 at 21:06
  • @SamVarshavchik I doubt this is about numerical conversion, I think he really wants to look at how the byte representation of floats would map to integers. – Cereal Dec 25 '20 at 21:10
  • What TF is your Python question? Please, make sure you reduce code in your questions to a [mcve] before asking here! As a new user, also take the [tour] and read [ask]. – Ulrich Eckhardt Dec 25 '20 at 21:26
  • A concise shallow answer is what BillLynch and PaulMcKenzie pointed out: there is a typo at the conversion (see edit). Question to experienced users: should I correct the typo at the title? I think not. – Gaston Dec 26 '20 at 14:23
  • @UlrichEckhardt I edited it in order to make the code more readable. I hope now the code-quality is better aligned with the site's rules – Gaston Dec 26 '20 at 14:38

1 Answer

The first problem with your code is that reading an object of one type through a pointer to an unrelated type is undefined behavior. The exceptions are narrow: the types share certain properties, like a common prefix, or the access goes through one of a few special types, like std::byte or char.

std::bit_cast (C++20, declared in <bit>) is a way to do it correctly.
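If C++20 is not available, the `memcpy` approach mentioned in the comments can be wrapped in a small helper. The following is only a sketch; `bit_cast_compat` is a hypothetical name and not part of the standard library.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical pre-C++20 stand-in for std::bit_cast: copies the object
// representation of src into a new object of type To.
template <class To, class From>
To bit_cast_compat(const From& src) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To dst;
    std::memcpy(&dst, &src, sizeof dst);
    return dst;
}
```

Assuming a 32-bit IEEE 754 float, `bit_cast_compat<std::uint32_t>(1.0f)` gives `0x3F800000`, and casting that value back to `float` returns `1.0f`.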

The second problem is that what the exact bits mean will vary based on things like the endianness of your computer and which floating point standard you are using. These are relatively standard now, but not completely.

Your third problem is that the size of a short, int and float are platform and compiler specific. There are fixed sized integer types you can use, like std::int32_t, and you should use that instead of int or short. Often short is 16 bits and int is 32, but that is far from universal. float being 32 bits is really common.
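One way to make these size assumptions explicit is to check them at compile time. This is a sketch only; `bits_in` is a made-up helper name, and the `static_assert` lines encode the assumptions the rest of the code would rely on.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <limits>

// Number of bits in the object representation of T.
template <class T>
constexpr std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}

// Fail the build early if the platform does not match the assumptions.
static_assert(bits_in<float>() == 32, "this code assumes a 32-bit float");
static_assert(bits_in<float>() == bits_in<std::int32_t>(),
              "float and std::int32_t must have the same size");
static_assert(std::numeric_limits<float>::is_iec559,
              "this code assumes IEEE 754 floats");
```

On a platform where `short` is 16 bits, `bits_in<short>()` returns 16, which is why reading a 32-bit float through a `short` can only ever see half of it.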

So:

std::int32_t i = std::bit_cast<std::int32_t>(f);

std::cout << f << " " << i <<std::endl;

will at least get rid of most of the insane issues.

I don't know offhand what the endianness conversion issues are when converting from floating point to integers. What I'd do is this:

std::uint32_t ui = std::bit_cast<std::uint32_t>(f);
std::cout << f << " 0b";
// Print each bit as 0 or 1, most significant bit first
for (int b = 31; b >= 0; --b)
  std::cout << ((ui >> b) & 1u);

std::int32_t i = std::bit_cast<std::int32_t>(f);
std::cout << " " << i <<std::endl;

to also dump out the bits of f as interpreted by the endianness of your architecture. Then take a floating point value whose bit representation you know (and isn't, like, all 0s), and look at what this generates.
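As a worked example of checking a known value, the sketch below splits a float into its sign, exponent, and mantissa fields. It assumes a 32-bit IEEE 754 float; `raw_bits`, `split_fields`, and `bit_string` are names invented for this illustration, and `memcpy` stands in for `std::bit_cast` so it also compiles before C++20.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

struct FloatFields {
    std::uint32_t sign;      // 1 bit  (bit 31)
    std::uint32_t exponent;  // 8 bits (bits 30..23), biased by 127
    std::uint32_t mantissa;  // 23 bits (bits 22..0)
};

// Copy the 32 bits of the float into an unsigned integer.
std::uint32_t raw_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Split the bit pattern into the three IEEE 754 binary32 fields.
FloatFields split_fields(float f) {
    std::uint32_t u = raw_bits(f);
    return FloatFields{u >> 31, (u >> 23) & 0xFFu, u & 0x7FFFFFu};
}

// Render the 32 bits as text, most significant bit first.
std::string bit_string(float f) {
    std::uint32_t u = raw_bits(f);
    std::string out;
    for (int b = 31; b >= 0; --b)
        out += ((u >> b) & 1u) ? '1' : '0';
    return out;
}
```

Under these assumptions, `1.0f` has sign 0, exponent 127, and mantissa 0, while `-0.15625f` has sign 1, exponent 124, and mantissa `0x200000`. The float's sign bit sits in bit 31, the same position as the sign bit of a two's complement 32-bit integer, which is consistent with the sign being preserved when all 32 bits are reinterpreted.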

Yakk - Adam Nevraumont
  • This is a great comment that expands on the typo (please read the edit) and adds more information. I will accept it, hoping you can find the time to mention how `int i = *(int *)&f` does preserve the sign in probably most vanilla implementations, as far as I understand. Thanks – Gaston Dec 26 '20 at 14:05