Scientific notation is the common way to express a number with an explicit order of magnitude. First a nonzero digit, then a radix point, then a fractional part, and the exponent. In binary, there is only one possible nonzero digit.

Binary floating-point formats use an implicit first digit equal to one, and the mantissa bits "follow the radix point."

So why does frexp() put the radix point to the left of the implicit bit, and return a number in [0.5, 1) instead of scientific-notation-like [1, 2)? Is there some overflow to beware of?

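Effectively, it subtracts one less than the bias value specified by IEEE 754/IEC 60559 from the stored exponent (126 rather than 127 in single precision), so the reported exponent is one greater than the scientific-notation exponent. In hardware, this potentially trades an addition for an XOR. Alone, that seems like a pretty weak argument, considering that in many cases getting back to the normal convention will require another floating-point operation.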

Potatoswatter
  • I think this question can be simplified: In traditional textbook scientific notation for a number base *b*, the significand is chosen to lie in the range [1, *b*). But `frexp` uses the convention of putting the significand in the range [1 / *b*, 1). Which one is right? Which one is better? I think it's a bit like asking whether range indexes should start at zero or at one. – Kerrek SB Jul 24 '14 at 08:41
  • @KerrekSB Actually, I'm just implementing a logarithm library now, and `frexp` is *not* appropriate for that. The integral part of the logarithm is what would be returned for the [1, 2) convention. I guess my question could be answered with examples of applicability of [1/b, 1), but really I'm wondering why `frexp` was designed this way, by whoever originally designed it. – Potatoswatter Jul 24 '14 at 08:49
  • You're right - in fact, `[1, b)` is just the exponential of the range `[0, 1)`, which we already agreed was [the right way to denote ranges](http://stackoverflow.com/q/9963401/596781). – Kerrek SB Jul 24 '14 at 08:54
  • `frexp()` does yield "scientific notation". "Scientific notation" is `a` times `power(base, exponent)` [Ref](http://en.wikipedia.org/wiki/Scientific_notation). `a` isn't restricted to a narrow range. A "normalized scientific notation" has a restricted range for `a`. "First a nonzero digit, then a radix point, then a fractional part, and the exponent." is an overly narrow definition of "scientific notation". The post is still a _good_ question as to why `frexp()` returns a `significand` and `exponent` as it does. It is just that the title could state the issue in a more direct and unbiased fashion. – chux - Reinstate Monica Sep 10 '14 at 16:53

1 Answer

The rationale for the C standard says:

4.5.4.2 The frexp function

The functions frexp, ldexp, and modf are primitives used by the remainder of the library. There was some sentiment for dropping them for the same reasons that ecvt, fcvt, and gcvt were dropped, but their adherents rescued them for general use. Their use is problematic: on nonbinary architectures ldexp may lose precision, and frexp may be inefficient.

One can speculate that the “remainder of the library” was more convenient to write with frexp's convention, or was already traditionally written against this interface although it did not provide any benefit.

I know that this does not fully answer the question, but it did not quite fit inside a comment.

I should also point out that some of the choices made in the design of the C language predate IEEE 754. Perhaps the format returned by frexp made sense with the PDP-11's floating-point format(s), or any other architecture on which a function frexp was first introduced. EDIT: See also page 155 of the manual for one PDP-11 model.

Peter O.
Pascal Cuoq
  • Thanks, that's certainly more than a comment! Not surprising that its genesis was UNIX, but it's interesting that someone once tried to kill it. – Potatoswatter Jul 24 '14 at 09:44
  • Yes, the PDP-11 floating point is documented as having the binary point to the left of the "hidden" bit... So for exponent == 0, the number is, as per frexp(), 0.5..0.99999... Interestingly, there is little difference between the PDP-11 and IEEE formats -- in effect the difference is the bias applied to the exponent, for IEEE the bias is 127 (for single length) where for the PDP-11 the same bit pattern (modulo byte ordering) has (effectively) a bias of 129. –  Jul 24 '14 at 23:11
  • @gmch That would make a great answer. It smells like a smoking gun, even if we can't be sure there's no other reason. – Potatoswatter Jul 25 '14 at 01:40
  • Actually, I'll just accept this. See page 155 of [this manual](http://pdos.csail.mit.edu/6.828/2005/readings/pdp11-40.pdf). – Potatoswatter Jul 25 '14 at 01:55