I will try to answer generically, without the gory details and without being bound to a particular floating point standard. If you are interested in those, you should familiarize yourself with the IEEE 754 floating point standard - a daunting task. While C/C++ implementations are not required to follow it, they usually do, and it is an authoritative source worth understanding for anyone deeply interested in the topic.
First of all, generically, a floating point number is represented as two distinct parts - a significand and an exponent. This is easy to understand for anyone familiar with scientific notation. In scientific notation, 42.42 can be written as 4242 * 10^-2 (where ^-2 means 10 to the power of -2). Here 4242 is the so-called significand, -2 is the exponent and 10 is the exponent base.
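If you want to poke at this decomposition from C, the standard library's frexp (from <math.h>) splits a double into exactly these two parts - just with base 2 instead of 10, which is what machines actually use. A minimal sketch:

```c
#include <math.h>
#include <stdio.h>

int main(void)
{
    double x = 42.42;
    int exp;
    /* frexp splits x into a significand in [0.5, 1) and a base-2
       exponent, so that x == sig * 2^exp. */
    double sig = frexp(x, &exp);
    printf("%g = %g * 2^%d\n", x, sig, exp); /* 42.42 = 0.662813 * 2^6 */
    return 0;
}
```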
The same idea can be encoded in binary. You just designate some bits for the significand, some bits for the exponent and some bits for the base (or fix the base implicitly).
In effect, a binary representation of a floating point number could look something like this:
[5 bits: how many bits the significand uses] [2 bits: base] [significand bits] [exponent bits]
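Note that this layout is just an illustration; real-world formats typically fix the field widths up front instead of spending bits to describe them. As a sketch of how the fields of the common IEEE 754 single-precision layout (1 sign bit, 8 exponent bits biased by 127, 23 significand bits) can be pulled apart in C:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    float f = 42.42f;
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits); /* reinterpret the float's bytes */

    uint32_t sign     = bits >> 31;           /* 1 bit                 */
    uint32_t exponent = (bits >> 23) & 0xFFu; /* 8 bits, biased by 127 */
    uint32_t fraction = bits & 0x7FFFFFu;     /* 23 significand bits   */

    printf("sign=%u exponent=%d fraction=0x%06X\n",
           (unsigned)sign, (int)exponent - 127, (unsigned)fraction);
    return 0;
}
```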
And this scheme allows one to encode much bigger numbers than integer encoding in the same number of bits. Potentially, with 32 bits and the above scheme, one can encode numbers up to 10 ^ (2 ^ 25) - put all 25 remaining bits into the exponent and pick base 10. That is a much, much bigger number than anything representable with a plain 32-bit integer!
However, it has its costs. The bigger (in magnitude) or the closer to zero the number becomes, the more bits are needed for the exponent (to encode the large power!), and the fewer bits are left for the significand. With that you invariably lose precision - simply because there is only a (very) finite set of values that can be represented with, say, eight significand bits.
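You can watch this loss of precision directly. A small sketch (assuming the usual IEEE-style 32-bit float with about 24 significand bits, and a 64-bit double):

```c
#include <stdio.h>

int main(void)
{
    /* A 32-bit float carries roughly 24 significand bits, so past 2^24
       it can no longer represent every integer. */
    float big = 16777216.0f;           /* 2^24 */
    printf("%d\n", big + 1.0f == big); /* typically prints 1: the +1 is lost */

    /* The same finiteness bites fractions: 0.1 and 0.2 have no exact
       binary representation, so their rounded sum misses 0.3. */
    printf("%d\n", 0.1 + 0.2 == 0.3);  /* typically prints 0 */
    return 0;
}
```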
That pretty much sums it up. The rest is the rules for producing the numbers, selecting the base and exponent, rounding the representation and so forth.