Converting a floating point to its corresponding bit-segments

Question

Given a Ruby Float value, e.g.,

f = 12.125

I'd like to wind up with a 3-element array containing the floating-point number's sign (1 bit), exponent (11 bits), and fraction (52 bits). (Ruby's floats are the IEEE 754 double-precision 64-bit representation.)

What's the best way to do that? Bit-level manipulation doesn't seem to be Ruby's strong point.

Note that I want the bits, not the numerical values they correspond to. For instance, getting [0, -127, 1] for the floating-point value of 1.0 is not what I'm after -- I want the actual bits in string form or an equivalent representation, like ["0", "0ff", "000 0000 0000"].

John, thanks for the mention of the [IEEE 754](http://en.wikipedia.org/wiki/IEEE_floating_point) representation of floats. I was not aware of that. If any readers need a reminder, an example calculation is given [here](http://class.ece.iastate.edu/arun/cpre381/ieee754/ie4.html). — Cary Swoveland, Sep 11 '14 at 16:27

Matt · Accepted Answer · 2014-09-12T10:34:32.597

The bit data can be exposed via Array#pack, as Float doesn't provide any methods for this itself.

str = [12.125].pack('D').bytes.reverse.map{|n| "%08b" %n }.join
=> "0100000000101000010000000000000000000000000000000000000000000000"

[ str[0], str[1..11], str[12..63] ]
=> ["0", "10000000010", "1000010000000000000000000000000000000000000000000000"]

This is a bit 'round the houses', pulling the fields out of a string representation. I'm sure there is a more efficient way to pull the data from the original bytes...
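
One possible way to skip the intermediate string is sketched below. It assumes a fixed big-endian view of the double is acceptable, so it packs with 'G' and reads the 8 bytes back as a single integer with 'Q>'. The method name float_bit_segments is made up for illustration and is not part of the original answer.

```ruby
# Hypothetical helper: returns [sign, exponent, fraction] as bit strings.
def float_bit_segments(f)
  # 'G' packs a double in big-endian order on every platform;
  # 'Q>' reads those 8 bytes back as a big-endian unsigned 64-bit integer.
  bits = [f].pack('G').unpack('Q>').first

  sign     = bits >> 63               # top bit
  exponent = (bits >> 52) & 0x7FF     # next 11 bits
  fraction = bits & 0xFFFFFFFFFFFFF   # low 52 bits

  [format('%01b', sign), format('%011b', exponent), format('%052b', fraction)]
end

p float_bit_segments(12.125)
```

Because the masks work on an Integer, the same three values can be formatted as hex instead ('%03x' and '%013x') if that is the preferred representation.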

Edit: The bit-level manipulation piqued my interest, so I had a poke around. To use the bitwise operations in Ruby you need an Integer, so the float requires some more unpacking to convert it into a 64-bit int. The big-endian/IEEE 754 documented representation is fairly trivial. The little-endian representation I'm not so sure about. It's a little odd, as you are not on complete byte boundaries with an 11-bit exponent and 52-bit mantissa. It becomes fiddly to pull the bits out and swap them about to get what resembles little endian, and I'm not sure it's right, as I haven't seen any reference to the layout. So the 64-bit value is little endian, but I'm not too sure how that applies to the components of the 64-bit value until you store them somewhere else, like a 16-bit int for the mantissa.

As an example, for an 11-bit value going from little to big, the kind of thing I was doing was to shift the low byte left by 3 so it moves to the front, then OR in the top 3 bits, which end up as the least significant 3 bits.

v = 0x4F2
((v & 0xFF) << 3) | (v >> 8)
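
To make the effect of that expression concrete, here is a small worked run on the same value. The variable name swapped is just for illustration.

```ruby
v = 0x4F2                               # 11-bit value: 100 11110010
swapped = ((v & 0xFF) << 3) | (v >> 8)  # low byte moves to the front, top 3 bits move to the end

puts format('%011b', v)        # 10011110010
puts format('%011b', swapped)  # 11110010100
```

In other words, on an 11-bit value the expression acts as a rotate left by 3 bits.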

Here it is anyway, hopefully it's of some use.

class Float
  Float::LITTLE_ENDIAN = [1.0].pack("E") == [1.0].pack("D")

  # Returns a sign, exponent and mantissa as integers
  def ieee745_binary64
    # Build a big end int representation so we can use bit operations
    tb = [self].pack('D').unpack('Q>').first

    # Check what we are
    if Float::LITTLE_ENDIAN
      ieee745_binary64_little_endian tb
    else
      ieee745_binary64_big_endian tb
    end
  end

  # Force a little end calc
  def ieee745_binary64_little
    ieee745_binary64_little_endian [self].pack('E').unpack('Q>').first
  end

  # Force a big end calc
  def ieee745_binary64_big
    ieee745_binary64_big_endian [self].pack('G').unpack('Q>').first
  end

  # Little
  def ieee745_binary64_little_endian big_end_int
    #puts "big #{big_end_int.to_s(2)}"
    sign     = ( big_end_int & 0x80   ) >> 7

    exp_a    = ( big_end_int & 0x7F   ) << 1   # get the last 7 bits, make it more significant
    exp_b    = ( big_end_int & 0x8000 ) >> 15  # get the 9th bit, to fill the sign gap
    exp_c    = ( big_end_int & 0x7000 ) >> 4   # get the 10-12th bit to stick on the front
    exponent = exp_a | exp_b | exp_c

    mant_a   = ( big_end_int & 0xFFFFFFFFFFFF0000 ) >> 12 # F000 was taken above
    mant_b   = ( big_end_int & 0x0000000000000F00 ) >> 8  #  F00 was left over
    mantissa = mant_a | mant_b

    [ sign, exponent, mantissa ]
  end

  # Big
  def ieee745_binary64_big_endian big_end_int
    sign     = ( big_end_int & 0x8000000000000000 ) >> 63
    exponent = ( big_end_int & 0x7FF0000000000000 ) >> 52
    mantissa = ( big_end_int & 0x000FFFFFFFFFFFFF ) >> 0

    [ sign, exponent, mantissa ]
  end
end

and testing...

def printer val, vals
  printf "%-15s   sign|%01b|\n",            val,     vals[0]
  printf "  hex e|%3x|         m|%013x|\n", vals[1], vals[2]
  printf "  bin e|%011b| m|%052b|\n\n",     vals[1], vals[2]
end

floats = [ 12.125, -12.125, 1.0/3, -1.0/3, 1.0, -1.0, 1.131313131313, -1.131313131313 ]

floats.each do |v|
  printer v, v.ieee745_binary64
  printer v, v.ieee745_binary64_big
end

TIL my brain is big endian! You'll note the ints being worked with are both big endian. I failed at bit shifting the other way.

(Note that `G` always picks big-endian, so it's not necessarily the native format as per my question. x86 and x86_64 are little-endian, for example, and would need to be reversed.) — John Feminella, Sep 10 '14 at 18:33
But then that would change the way you extract the values depending on endianness. Wouldn't 'G' give you the platform-independent result? Sorry, I don't live in an endian world =) — Matt, Sep 10 '14 at 18:52
Right, there would need to be a detection method to figure it out (e.g. use a reference float like 1.0 and compare the bits to decide if it's little-endian or big-endian). — John Feminella, Sep 10 '14 at 21:18
So you want the local endian representation for each of the 3 values? — Matt, Sep 11 '14 at 09:55
Right. However, I think this answer is close enough for my purposes so I'm going to accept it. Thanks for your help! — John Feminella, Sep 11 '14 at 14:09
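
The detection idea from the comments (pack a reference float such as 1.0 and compare its bytes) could be sketched roughly as follows. The method name native_double_little_endian? is hypothetical; the pack directives 'D' (native), 'E' (little-endian) and 'G' (big-endian) are the standard Array#pack ones.

```ruby
# Hypothetical helper: true when the platform stores doubles little-endian.
def native_double_little_endian?
  native = [1.0].pack('D')     # platform byte order
  native == [1.0].pack('E')    # 'E' always produces little-endian bytes
end

# Bits of a float in the platform's own byte order, one group per byte.
def native_byte_bits(f)
  [f].pack('D').bytes.map { |b| format('%08b', b) }.join(' ')
end

puts native_double_little_endian?
puts native_byte_bits(1.0)
```

The comparison works because the byte pattern of 1.0 (3F F0 00 00 00 00 00 00 in big-endian order) is not symmetric, so the two orders can never coincide.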

score 3 · Answer 2 · answered Sep 10 '14 at 15:55

Use frexp from the Math module. From the doc:

fraction, exponent = Math.frexp(1234)   #=> [0.6025390625, 11]
fraction * 2**exponent                  #=> 1234.0

The sign bit is easy to find on its own.

Patrice Gahide

The values returned by `frexp` do not exactly correspond to the IEEE 754 convention, although they are equivalent to it. If you are really interested in the bits, you need to take that into account. The (probably historical) reasons are guessed at in http://stackoverflow.com/questions/24928833/why-does-frexp-not-yield-scientific-notation – Pascal Cuoq Sep 10 '14 at 15:59
I'm aware of `frexp`, but I want the actual bits, not the numerical values they correspond to -- for example, for the exponent I'd expect to see an 11-character string of 0s and 1s, or a 3 character string of hex digits, etc. – John Feminella Sep 10 '14 at 16:36
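
To illustrate the difference in conventions mentioned above: frexp returns a fraction in [0.5, 1), whereas IEEE 754 stores a significand in [1, 2) with an implicit leading 1 and an exponent biased by 1023. A rough sketch of the adjustment, valid only for normal (non-zero, finite, non-subnormal) values, might look like this. The method name frexp_to_ieee754_fields is hypothetical.

```ruby
# Hypothetical helper: derives [sign, biased exponent, fraction] as integers
# from Math.frexp. Assumes f is a normal float (not 0, subnormal, Infinity or NaN).
def frexp_to_ieee754_fields(f)
  fraction, exponent = Math.frexp(f.abs)        # fraction is in [0.5, 1)

  sign            = f < 0 ? 1 : 0
  biased_exponent = exponent - 1 + 1023         # shift to the [1, 2) convention, then add the bias
  stored_fraction = ((fraction * 2 - 1) * 2**52).to_i  # drop the implicit leading 1

  [sign, biased_exponent, stored_fraction]
end

s, e, m = frexp_to_ieee754_fields(12.125)
puts format('%01b %011b %052b', s, e, m)
```

Zero, subnormals, infinities and NaN use special exponent encodings, so for those reading the bits directly via pack is the safer route.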
