It's not fast, but it is doable. (Note that some GLSL ES 1.00 compilers have bugs converting floating-point literals.)
struct Bitset8Bits {
  mediump vec4 bit0;
  mediump vec4 bit1;
  mediump vec4 bit2;
  mediump vec4 bit3;
  mediump vec4 bit4;
  mediump vec4 bit5;
  mediump vec4 bit6;
  mediump vec4 bit7;
};
// Branchless per-component "greater than": 1.0 where l > r, 0.0 otherwise.
vec4 when_gt (vec4 l, vec4 r) {
  return max(sign(l - r), 0.0);
}
// Splits four byte values (0..255) into their individual bits.
// mediump because the incoming values go up to 255, which exceeds
// lowp's guaranteed range.
Bitset8Bits unpack_4_bytes (mediump vec4 byte) {
  Bitset8Bits result;
  result.bit7 = when_gt(byte, vec4(127.5));
  vec4 bits0to6 = byte - 128.0 * result.bit7;
  result.bit6 = when_gt(bits0to6, vec4(63.5));
  vec4 bits0to5 = bits0to6 - 64.0 * result.bit6;
  result.bit5 = when_gt(bits0to5, vec4(31.5));
  vec4 bits0to4 = bits0to5 - 32.0 * result.bit5;
  result.bit4 = when_gt(bits0to4, vec4(15.5));
  vec4 bits0to3 = bits0to4 - 16.0 * result.bit4;
  result.bit3 = when_gt(bits0to3, vec4(7.5));
  vec4 bits0to2 = bits0to3 - 8.0 * result.bit3;
  result.bit2 = when_gt(bits0to2, vec4(3.5));
  vec4 bits0to1 = bits0to2 - 4.0 * result.bit2;
  result.bit1 = when_gt(bits0to1, vec4(1.5));
  vec4 bit0 = bits0to1 - 2.0 * result.bit1;
  result.bit0 = when_gt(bit0, vec4(0.5));
  return result;
}
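// Example usage (sketch; u_data and uv are assumed names): test one bit of a
// texel from an RGBA8 texture. Texels are sampled in 0..1, so scale them by
// 255.0 before unpacking.
float example_red_bit5 (sampler2D u_data, mediump vec2 uv) {
  mediump vec4 bytes = texture2D(u_data, uv) * 255.0;
  Bitset8Bits bits = unpack_4_bytes(bytes);
  return bits.bit5.r; // 1.0 if bit 5 of the red byte is set, else 0.0
}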
// Scalar variant of when_gt: 1.0 when l > r, 0.0 otherwise.
float when_gt (float l, float r) {
  return max(sign(l - r), 0.0);
}
// Recombines the bits into four byte values (0..255).
vec4 pack_4_bytes (Bitset8Bits state) {
  vec4 data;
  data = state.bit0
      + 2.0 * state.bit1
      + 4.0 * state.bit2
      + 8.0 * state.bit3
      + 16.0 * state.bit4
      + 32.0 * state.bit5
      + 64.0 * state.bit6
      + 128.0 * state.bit7;
  return data;
}
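// Example usage (sketch; the flag choice is just an illustration): unpacking
// and repacking is the identity, so single bits can be flipped in between.
// The result is in 0..255 and has to be divided by 255.0 before it can be
// written to an RGBA8 color attachment.
vec4 example_toggle_flag (mediump vec4 texel_0_to_1) {
  Bitset8Bits bits = unpack_4_bytes(texel_0_to_1 * 255.0);
  bits.bit3.g = 1.0 - bits.bit3.g; // toggle bit 3 of the green byte
  return pack_4_bytes(bits) / 255.0; // back to 0..1 for gl_FragColor
}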
vec4 brians_float_pack (float original_value) {
  // Remove the sign
  float pos_value = abs(original_value);
  float exp_real = floor(log2(pos_value));
  float multiplier = pow(2.0, exp_real);
  float normalized = pos_value / multiplier - 1.0;
  float exp_v = exp_real + 127.0; // biased exponent
  // Edge cases not handled here:
  // if exp_real == -Inf (zero input)    -> exponent byte should be 0
  // if exp_real == +Inf (Inf/NaN input) -> exponent byte should be 255
  // if exp_real < -126.0 -> denormalized (no implicit leading "1")
  Bitset8Bits packed_v;
  // Sign bit: 1.0 when original_value is negative
  packed_v.bit7.a = step(sign(original_value) - 1.0, -1.5);
  // Exponent 8 bits
  packed_v.bit6.a = when_gt(exp_v, 127.5);
  float bits0to6 = exp_v - 128.0 * packed_v.bit6.a;
  packed_v.bit5.a = when_gt(bits0to6, 63.5);
  float bits0to5 = bits0to6 - 64.0 * packed_v.bit5.a;
  packed_v.bit4.a = when_gt(bits0to5, 31.5);
  float bits0to4 = bits0to5 - 32.0 * packed_v.bit4.a;
  packed_v.bit3.a = when_gt(bits0to4, 15.5);
  float bits0to3 = bits0to4 - 16.0 * packed_v.bit3.a;
  packed_v.bit2.a = when_gt(bits0to3, 7.5);
  float bits0to2 = bits0to3 - 8.0 * packed_v.bit2.a;
  packed_v.bit1.a = when_gt(bits0to2, 3.5);
  float bits0to1 = bits0to2 - 4.0 * packed_v.bit1.a;
  packed_v.bit0.a = when_gt(bits0to1, 1.5);
  float bit0 = bits0to1 - 2.0 * packed_v.bit0.a;
  packed_v.bit7.b = when_gt(bit0, 0.5);
  // Significand 23 bits
  float factor = 0.5;
  // Compare against factor minus a tiny epsilon (~0.4999999 for the
  // first bit) to tolerate rounding error in "normalized".
  // Significand MSB bit 22:
  packed_v.bit6.b = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit6.b;
  factor = 0.5 * factor;
  packed_v.bit5.b = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit5.b;
  factor = 0.5 * factor;
  packed_v.bit4.b = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit4.b;
  factor = 0.5 * factor;
  packed_v.bit3.b = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit3.b;
  factor = 0.5 * factor;
  packed_v.bit2.b = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit2.b;
  factor = 0.5 * factor;
  packed_v.bit1.b = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit1.b;
  factor = 0.5 * factor;
  packed_v.bit0.b = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit0.b;
  factor = 0.5 * factor;
  packed_v.bit7.g = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit7.g;
  factor = 0.5 * factor;
  packed_v.bit6.g = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit6.g;
  factor = 0.5 * factor;
  packed_v.bit5.g = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit5.g;
  factor = 0.5 * factor;
  packed_v.bit4.g = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit4.g;
  factor = 0.5 * factor;
  packed_v.bit3.g = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit3.g;
  factor = 0.5 * factor;
  packed_v.bit2.g = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit2.g;
  factor = 0.5 * factor;
  packed_v.bit1.g = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit1.g;
  factor = 0.5 * factor;
  packed_v.bit0.g = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit0.g;
  factor = 0.5 * factor;
  packed_v.bit7.r = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit7.r;
  factor = 0.5 * factor;
  packed_v.bit6.r = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit6.r;
  factor = 0.5 * factor;
  packed_v.bit5.r = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit5.r;
  factor = 0.5 * factor;
  packed_v.bit4.r = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit4.r;
  factor = 0.5 * factor;
  packed_v.bit3.r = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit3.r;
  factor = 0.5 * factor;
  packed_v.bit2.r = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit2.r;
  factor = 0.5 * factor;
  packed_v.bit1.r = when_gt(normalized, factor - 0.00000005);
  normalized = normalized - factor * packed_v.bit1.r;
  factor = 0.5 * factor;
  // LSB bit 0
  packed_v.bit0.r = when_gt(normalized, factor - 0.00000005);
  vec4 result = pack_4_bytes(packed_v);
  return result;
}
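// Example usage (sketch; u_value is an assumed uniform, highp support in the
// fragment shader is assumed, and so is a default float precision statement
// at the top of the shader, which the helpers above also need). The packed
// bytes land in (r, g, b, a) with the mantissa's low byte in r and the
// sign/exponent byte in a, i.e. a little-endian IEEE 754 layout. Divide by
// 255.0 when writing to an RGBA8 target; the four bytes read back with
// readPixels can then be reinterpreted as a 32-bit float on a little-endian
// host.
uniform highp float u_value;

void main () {
  gl_FragColor = brians_float_pack(u_value) / 255.0;
}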