4

I have been trying for the last three days to understand the exact differences between floating and fixed point representations. I am confused reading the material and I'm unable to decide what is right and what is wrong.

One of the problems is with the meaning of a few technical terms like 'precision', 'mantissa', 'denormalised', 'underflow', etc.

Can anyone give the differences with examples?

The points I have been able to find out so far (and understand clearly) are as follows:

Floating point -
1. Advantage: Provides a very large range
2. Disadvantage: Rounds off large numbers

Fixed point -
1. Advantage: Numbers are represented exactly (used when 'money' is involved)
2. Disadvantage: Provides a very limited range

But I know there are a lot more differences (Advantages and disadvantages mainly). Can anyone list them out with explanations?

  • Wikipedia should give you at least a grounding in terminology, but if you are willing to devote some time to the subject, the paper by [David Goldberg](http://cr.yp.to/2005-590/goldberg.pdf) should give you an excellent understanding of the concepts, limitations, and subtleties of floating-point arithmetic. You don't need to understand every proof, but if you're serious about C/C++ (or any?) programming as you mention, this stuff is vital. – Brett Hale Mar 10 '12 at 15:26
  • @Hale Thanks for your help :) Basically I began exploring this topic when I stumbled upon datatypes in Oracle SQL. Also I am learning Principles of Programming Languages in my college. – progammer Mar 10 '12 at 15:40
  • Numbers are not represented exactly in fixed point. I don't know what gave you that idea. You may think that it's better for currency amounts because since you have to implement fixed-point yourself, you can do *decimal fixed-point*, but the axes binary/decimal and fixed-point/floating-point are orthogonal. – Pascal Cuoq Feb 26 '15 at 16:12

4 Answers

6

The technicalities behind floating point take a lot of time to get used to. I will not go into detail here.

Simply stated, floating point achieves a very large range (from numbers very close to zero up to very large numbers, sometimes even larger than the number of atoms in the universe). Floating point achieves this by keeping the relative error roughly constant, i.e. the number starts to be rounded after a fixed number of significant digits (this is a simplification, but it helps to understand the principle). This is very similar to the concept of "significant figures" from most natural sciences. However, this means that floating point numbers are almost always rounded in some way. If you add a very small number to a very big number, the small number will simply be absorbed and the big number stays as it was; this happens whenever the small number is below the rounding threshold of the big one. If you add many numbers, it can therefore be necessary to sort them first and add the small ones before the big ones. There is also the concept of numeric stability to consider, i.e. how far an algorithm drifts away from the correct result due to the accumulated rounding.
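For example, here is a small C++ sketch of that absorption effect and of why the order of summation matters (the values 1e16 and 10000 are just illustrative):

```cpp
#include <cstdio>

int main() {
    // A double carries roughly 15-16 significant decimal digits.
    // Adding a value far below that relative threshold changes nothing.
    double big = 1e16;
    double sum = big + 1.0;           // 1.0 is below the rounding threshold of 1e16
    std::printf("%.1f\n", sum - big); // prints 0.0: the 1.0 was absorbed

    // Summation order matters: adding the small terms first lets them
    // accumulate before they meet the big value.
    double naive = 1e16;
    for (int i = 0; i < 10000; ++i) naive += 1.0;   // each 1.0 is lost
    double careful = 0.0;
    for (int i = 0; i < 10000; ++i) careful += 1.0; // small terms add up first
    careful += 1e16;
    std::printf("%.1f vs %.1f\n", naive - 1e16, careful - 1e16); // 0.0 vs 10000.0
}
```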

Fixed point representation, on the other hand, always has the same absolute error. If you store currency with 4 decimal places, you know each value is off by at most 0.00005 of the currency unit. If you add your data, this error can again accumulate, but the rules for how it grows are quite different from the rules for floating point.
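A small illustration of the difference, assuming a hypothetical decimal fixed point format that stores values as an integer count of ten-thousandths:

```cpp
#include <cstdio>
#include <cstdint>

int main() {
    // Binary floating point cannot represent 0.1 exactly, so the tiny
    // representation error accumulates with every addition.
    double d = 0.0;
    for (int i = 0; i < 1000; ++i) d += 0.1;
    std::printf("floating point: %.17f\n", d);   // not exactly 100

    // Decimal fixed point: store everything as an integer number of
    // ten-thousandths (4 decimal places). 0.1 becomes exactly 1000 units,
    // so the sum is exact; the only error is the initial rounding of each
    // value, bounded by half a unit (0.00005).
    std::int64_t f = 0;
    for (int i = 0; i < 1000; ++i) f += 1000;    // 0.1 = 1000 * 0.0001
    std::printf("fixed point:    %lld.%04lld\n",
                static_cast<long long>(f / 10000),
                static_cast<long long>(f % 10000)); // prints 100.0000
}
```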

Unless you are doing heavy duty numeric work, these problems should usually not worry you. Most of the time floating point numbers and fixed point numbers work just fine when good care is taken (e.g. never use == on floating point numbers or fixed point numbers; the correct way to compare them, however, is different for the two). Also, AFAIK floats are used more often in scientific work, because scientists usually have training in numerics, know how to deal with rounding, and are only interested in relatively exact results. Fixed point is used in finance, where every rounding has to be accounted for and stored somewhere (often the banks will just keep the rounded half microcents), so you have to have very good control of the absolute error to be able to account for it later.
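For instance, a common way to avoid == on floating point values is to compare against a tolerance. The helper and tolerance values below are just one possible sketch, not a standard recipe:

```cpp
#include <cmath>
#include <cstdio>

// Treat two doubles as equal when they differ by less than a small
// relative tolerance, with an absolute floor for values near zero.
// The tolerances are illustrative, not universal constants.
bool nearly_equal(double a, double b,
                  double rel_tol = 1e-9, double abs_tol = 1e-12) {
    double diff  = std::fabs(a - b);
    double scale = std::fmax(std::fabs(a), std::fabs(b));
    return diff <= std::fmax(rel_tol * scale, abs_tol);
}

int main() {
    double x = 0.1 + 0.2;
    std::printf("== gives %d\n", (int)(x == 0.3));              // 0: bitwise unequal
    std::printf("tolerance gives %d\n", (int)nearly_equal(x, 0.3)); // 1
}
```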

LiKao
  • I understand the technicalities (the representation) at least to some extent. Thanks for the very simple explanation :) Can you tell me the difference between having a constant relative error and a constant absolute error? – progammer Mar 10 '12 at 13:22
  • "Unless you are doing heavy duty numeric work" - I couldn't disagree more with this statement. That one-in-a-million case where the results blow up due to "catastrophic cancellation", that can't be easily replayed and debugged. Or sloppy numerical methods hidden in opaque / complex code libraries. Better to quantify any source of floating-point error and handle it correctly, like any other 'exceptional' condition in a program. – Brett Hale Mar 10 '12 at 15:43
  • @BrettHale: Ok, I guess I should have been more careful with such a statement. My usual work really does not contain many floating point operations, so I am not very experienced with them (at least not much more than what I have learned in classes). I think I should be more careful with such statements. – LiKao Mar 11 '12 at 01:07
3

Floating point numbers are good for, well, floating points, i.e. when you need to express numbers across varying scales. You sacrifice precision to gain range of scale.

On the other hand, fixed point numbers are only suitable at a fixed scale (and they'll overflow or underflow if you scale them too far), but you gain precision as long as you remain within the intended scale.

In short: If you multiply a lot but don't add numbers of different scales, use floating points. If you add a lot but don't multiply, use fixed points.

(A good example of a fixed-point use case is anything relating to currency: essentially, you can fix your unit to be cents, or one hundredth of a cent, and make all your monetary values integers in that unit.)
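For instance, a minimal sketch of the cents approach (the amounts are made up):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    // Keep all monetary values as an integer number of cents,
    // so addition and subtraction are exact.
    std::int64_t price_cents = 1999;  // $19.99
    std::int64_t tax_cents   = 165;   // $1.65
    std::int64_t total_cents = price_cents + tax_cents;

    std::printf("total: $%lld.%02lld\n",
                static_cast<long long>(total_cents / 100),
                static_cast<long long>(total_cents % 100)); // total: $21.64

    // Compare with doubles: 19.99 and 1.65 are not exactly representable,
    // so repeated arithmetic can drift by fractions of a cent.
    double total_double = 19.99 + 1.65;
    std::printf("double: %.17f\n", total_double);
}
```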

Kerrek SB
  • Hi Kerrek :) Can you tell me EXACTLY what is meant by 'precision'? I understand the English meaning, but in Oracle SQL, for example, we define a number(precision, scale), where precision means the total number of 'digits'. It's just confusing! – progammer Mar 10 '12 at 13:06
  • Floating point - Advantages: high precision. Disadvantages: expensive in terms of area and power (computationally intensive). Fixed point - Advantages: area and power efficient. Disadvantages: limited dynamic range and accuracy. – progammer Mar 10 '12 at 13:08
  • @Appy: Precision refers to the amount of information. For example, if you measure the length of something, you could say it's 1m long, or 1.2m or 1.2041m. All those are 'correct' to some degree, but the measurements are of increasing *precision* -- i.e. they contain more significant digits worth of information (at the expense of requiring more storage, and also finer instruments). Also, it's important to understand that higher precision does not mean higher accuracy: If the object is truly 1.2100m long, then 1.2m is accurate at its precision, but 1.2041m is not accurate at *its* precision. – Kerrek SB Mar 10 '12 at 13:10
  • This is what I saw in a pdf document. Here's the link - http://www.ee.ucla.edu/~ingrid/Courses/ee201aS02/lectures/ee201a_full_presJatinL10.pdf – progammer Mar 10 '12 at 13:11
  • I am sorry, but I didn't understand the difference between accuracy and precision, and also the concept of accuracy! I thought 1.2 is less accurate than 1.2041 (in your example) because 1.2100-1.2 = 0.01 and 1.2100-1.2041 = 0.0059! :O – progammer Mar 10 '12 at 13:15
  • Precision refers to the amount of information given, i.e. how many figures there are; if you have n figures, the precision can be measured from that. A number with n figures after the decimal point suggests it can measure correctly at that level of detail. I.e. for 1.2 it tells you that it can measure the number up to one decimal; it does not tell you anything about the next decimals, so for the promise it gives you it is correct. The number 1.2041 tells you that it can measure up to four decimals, but it is wrong at the second decimal. I.e. it has higher precision, but it is wrong. – LiKao Mar 10 '12 at 13:27
0

Fixed point is a representation of a fractional number in integer format, so operations can be applied to the number just like to integers. The advantage of this is that floating point arithmetic is costlier in processing power; newer processors have dedicated FPUs (floating point units) for handling it.

So fixed point arithmetic is used when processing power is limited and a little precision loss doesn't cause havoc.
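As a concrete illustration, here is a minimal sketch of one possible fixed point format (Q16.16, i.e. 16 integer and 16 fractional bits); the format choice is just an example, not something prescribed by the answer:

```cpp
#include <cstdint>
#include <cstdio>

// Minimal Q16.16 fixed point: 16 integer bits, 16 fractional bits,
// stored in a plain 32-bit integer. All arithmetic is integer arithmetic,
// so no FPU is needed.
using fix16 = std::int32_t;
constexpr int FRAC_BITS = 16;

fix16 from_double(double x) { return static_cast<fix16>(x * (1 << FRAC_BITS)); }
double to_double(fix16 x)   { return static_cast<double>(x) / (1 << FRAC_BITS); }

fix16 mul(fix16 a, fix16 b) {
    // Widen to 64 bits, multiply, then shift the extra fractional bits away.
    return static_cast<fix16>((static_cast<std::int64_t>(a) * b) >> FRAC_BITS);
}

int main() {
    fix16 a = from_double(3.25);
    fix16 b = from_double(1.5);
    fix16 sum  = a + b;        // addition is just integer addition
    fix16 prod = mul(a, b);
    std::printf("sum  = %f\n", to_double(sum));   // 4.750000
    std::printf("prod = %f\n", to_double(prod));  // 4.875000
}
```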

-1

Fixed point numbers can be sorted in linear time. Fixed point is also unambiguous; each numerical value that can be expressed in a specific fixed point format has only one representation, which isn't true of floating point.

Floating point has a much wider representable range. It is also ambiguous. Floating point numbers can be sorted in NlogN time.

Ian
  • Hi Ian, I have a couple of questions about your answer: **a)** you say that fixed point numbers can be sorted in O(N) time and floating point in O(N log N). Can you explain that? I can't see why that would be. **b)** what do you mean by floating point numbers being ambiguous? ——— On second thought, I can't see any reason why there should be any difference in the time complexity of sorting fixed and floating point numbers. Could you explain please? – Wai Ha Lee Apr 25 '15 at 22:09
  • Floating-point numbers can be sorted as if they were integers. They were made this way on purpose. – Yay295 Mar 12 '16 at 23:33
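A small sketch of what the last comment describes: for non-negative, non-NaN IEEE 754 floats, the raw bit pattern read as an unsigned integer increases in the same order as the value, which is what makes integer-style (e.g. radix) sorting applicable to floats as well. Negative numbers and NaNs need extra handling that is omitted here:

```cpp
#include <cstdint>
#include <cstring>
#include <cstdio>

// For non-negative IEEE 754 floats, the bit pattern, interpreted as an
// unsigned integer, is ordered the same way as the floating point value.
std::uint32_t key(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // reinterpret the bits safely
    return bits;
}

int main() {
    float a = 1.5f, b = 2.5f, c = 1000.0f;
    std::printf("%d %d\n", (int)(key(a) < key(b)), (int)(key(b) < key(c))); // 1 1
}
```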