I'm looking for a set of functions that can add, subtract, multiply, and divide values as if they were 128-bit floats. I would prefer something fast, even if that means losing some precision compared to a true 128-bit float.
I realize that there is a __float128 type in some C++ compilers, and there might be a 128-bit type in C as well, but sadly I can only use doubles in my program. (Short explanation: I'm using WebAssembly, which lets you run C-like code online, and I am using a WASM-to-C "compiler." That language only has 32-bit and 64-bit floats and ints.)
Please also provide a method to convert it to float, and a simple English or JavaScript explanation of how to initialize it. No other methods are needed!
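To make the request concrete, here is a rough sketch of the kind of interface I have in mind. It assumes a "double-double" representation (a value stored as the unevaluated sum of two doubles), which gives roughly 106 bits of significand rather than a true IEEE 128-bit float. The names dd, dd_add, dd_sub, dd_mul, dd_div, dd_from_double and dd_to_double are hypothetical, and the code assumes strict IEEE double arithmetic (no fast-math style reordering), which I believe is what WebAssembly's f64 provides.

```c
/* Hypothetical double-double type: the value is hi + lo, with lo much smaller than hi. */
typedef struct { double hi, lo; } dd;

/* Error-free sum: returns s and e such that s + e == a + b exactly. */
static dd two_sum(double a, double b) {
    double s = a + b;
    double bb = s - a;
    double e = (a - (s - bb)) + (b - bb);
    return (dd){ s, e };
}

/* Cheaper variant, intended for |a| >= |b|; used to renormalize a result. */
static dd quick_two_sum(double a, double b) {
    double s = a + b;
    double e = b - (s - a);
    return (dd){ s, e };
}

/* Dekker split: a == *h + *l, each half short enough that products are exact.
   Can overflow for |a| near the top of the double range. */
static void split(double a, double *h, double *l) {
    double t = 134217729.0 * a; /* 2^27 + 1 */
    *h = t - (t - a);
    *l = a - *h;
}

/* Error-free product: returns p and e such that p + e == a * b (barring overflow/underflow). */
static dd two_prod(double a, double b) {
    double p = a * b;
    double ah, al, bh, bl;
    split(a, &ah, &al);
    split(b, &bh, &bl);
    double e = ((ah * bh - p) + ah * bl + al * bh) + al * bl;
    return (dd){ p, e };
}

/* Initialization: a plain double becomes { x, 0.0 }. */
dd dd_from_double(double x) { return (dd){ x, 0.0 }; }

/* Conversion back to a plain double (rounds away the extra bits). */
double dd_to_double(dd a) { return a.hi + a.lo; }

dd dd_neg(dd a) { return (dd){ -a.hi, -a.lo }; }

/* "Sloppy" addition: fast, but can lose accuracy when the high parts nearly cancel. */
dd dd_add(dd a, dd b) {
    dd s = two_sum(a.hi, b.hi);
    s.lo += a.lo + b.lo;
    return quick_two_sum(s.hi, s.lo);
}

dd dd_sub(dd a, dd b) { return dd_add(a, dd_neg(b)); }

dd dd_mul(dd a, dd b) {
    dd p = two_prod(a.hi, b.hi);
    p.lo += a.hi * b.lo + a.lo * b.hi; /* cross terms; a.lo * b.lo is dropped */
    return quick_two_sum(p.hi, p.lo);
}

/* Long division: three successive quotient estimates from the high parts. */
dd dd_div(dd a, dd b) {
    double q1 = a.hi / b.hi;
    dd r = dd_sub(a, dd_mul(b, dd_from_double(q1)));
    double q2 = r.hi / b.hi;
    r = dd_sub(r, dd_mul(b, dd_from_double(q2)));
    double q3 = r.hi / b.hi;
    dd q = quick_two_sum(q1, q2);
    return dd_add(q, dd_from_double(q3));
}
```

In plain English, initializing from an ordinary number would mean setting hi to that number and lo to 0. A value with more digits than one double can hold could be built by adding two such values, for example dd_add(dd_from_double(1.0), dd_from_double(1e-20)). Something along these lines, or anything faster, would be fine.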
I've seen a similar question for another language that only has 32-bit floats: the WebGL highp to 64-bit question.
If anyone can create 192-bit or 256-bit floats/fixed-points, that would be useful as well.