I know how to calculate the range, maximum, and minimum values for integer types such as short int, int, long int, and char. For example, if char is 1 byte, then for signed char the minimum value is -2^(1 byte * 8 - 1) and the maximum is +2^(1 byte * 8 - 1) - 1, and the range is maximum - minimum + 1. But with these formulas I cannot figure out the maximum and minimum values for a float type. In C++ the minimum and maximum values for float are given as 3.4*10^(-38) and 3.4*10^(+38). Could someone please explain how to calculate the maximum and minimum values for floating-point types in a simple way? This question may have been answered before, but I did not understand the definitions in those answers, so please explain it in a way that is easy to follow.
It's a little more complicated for floating point numbers, due to the way they are stored. https://stackoverflow.com/questions/21895756/why-are-floating-point-numbers-inaccurate has a pretty comprehensive explanation – Matt Jul 20 '19 at 05:52
std::numeric_limits – Jonathan Potter Jul 20 '19 at 05:58
https://en.cppreference.com/w/cpp/types/numeric_limits – Shawn Jul 20 '19 at 06:00
2 Answers
For a double, the mantissa (also called the significand) is 53 bits and the exponent is 11 bits. Assuming we calculate the value of the floating-point number with the formula m*2^e, where m is a 53-bit integer, the exponent range is [-1074, 971]. These values follow from the IEEE 754 standard.
So the maximum value is (2^53-1)*2^971 and the smallest strictly positive value is 2^-1074, where ^ means "to the power of".
I am assuming that the compiler uses the IEEE 754 standard, which C++ does not require but which is almost always the case in practice.
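As a rough check of the m*2^e view, here is a minimal sketch that builds both extremes with std::ldexp and compares them with std::numeric_limits. The function names are made up for illustration, and the code assumes double is an IEEE 754 binary64 type with subnormals enabled.

```cpp
#include <cmath>
#include <limits>

// Largest finite double, built as m * 2^e with the largest 53-bit integer m.
// Assumption: double is IEEE 754 binary64 (the C++ standard does not require this).
double largest_double() {
    const double m = 9007199254740991.0;  // 2^53 - 1, exactly representable
    return std::ldexp(m, 971);            // m * 2^971
}

// Smallest strictly positive double (a subnormal): 1 * 2^-1074.
double smallest_positive_double() {
    return std::ldexp(1.0, -1074);
}

// True when both hand-built values agree with what the library reports.
bool matches_numeric_limits() {
    using lim = std::numeric_limits<double>;
    return largest_double() == lim::max() &&
           smallest_positive_double() == lim::denorm_min();
}
```

On a typical desktop compiler matches_numeric_limits() should return true. Note that std::numeric_limits<double>::min() is the smallest normal value, 2^-1022, not the smallest subnormal.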

Say a double is 8 bytes, which is 64 bits. You say 53 bits are the significand, so 11 bits remain. From there, how did you derive -1022 and 1023? – Jul 20 '19 at 06:10
@TusharAhmed 1023 and -1022 are the values specified by the IEEE 754 standard. – john Jul 20 '19 at 06:11
@TusharAhmed In a 64 bit double, there's 1 sign bit, 11 exponent bits, and 52 mantissa bits. But the mantissa is *normalised* so in effect there are 53 mantissa bits. – john Jul 20 '19 at 06:13
This answer discusses only IEEE-754 binary interchange formats.
First, we must understand the format in which floating-point numbers are encoded. IEEE-754 specifies that a binary floating-point number is represented with:
- a 1-bit sign S,
- a w-bit biased exponent e, and
- a p-bit significand f, which is primarily encoded with t = p−1 bits.
The exponent e is encoded by adding a bias, so the actual value E stored in the w bits is E = e + bias. The bias is specified to be $2^{k-p-1}-1$, where k is the width of the format (such as 32 for a 32-bit float).¹ The precision p is specified to be 11 and 24 for 16-bit and 32-bit widths and $k - \mathrm{round}(4 \cdot \log_2 k) + 13$ for other widths. Note that k−p = k−(p−1)−1 equals the width of the exponent field, w, as taking the entire encoding (k bits) and removing the significand encoding (p−1 bits) and the sign bit (1 bit) leaves just the exponent encoding, so the bias $2^{k-p-1}-1$ equals $2^{w-1}-1$.
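The parameter rules above can be sketched in a few lines. FormatParams and ieee754_params are hypothetical names introduced only for this illustration, and the function is meant for the widths the standard actually specifies (16, 32, 64, and multiples of 32 from 128 up).

```cpp
#include <cmath>

// Hypothetical container for the parameters of a binary interchange format.
struct FormatParams {
    int k;     // total width in bits
    int p;     // precision: significand bits, including the implicit leading bit
    int w;     // width of the exponent field
    int bias;  // exponent bias
};

// Derives p, w and bias from the width k using the rules quoted in the text.
FormatParams ieee754_params(int k) {
    int p;
    if (k == 16) {
        p = 11;
    } else if (k == 32) {
        p = 24;
    } else {
        // k - round(4 * log2(k)) + 13 for the other specified widths
        p = k - static_cast<int>(std::lround(4.0 * std::log2(static_cast<double>(k)))) + 13;
    }
    const int w = k - p;                  // what is left after sign and significand field
    const int bias = (1 << (w - 1)) - 1;  // 2^(w-1) - 1
    return {k, p, w, bias};
}
```

For k = 64 this gives p = 53, w = 11 and bias = 1023, the familiar numbers for double.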
The value of the exponent field that has all ones in its binary representation, $2^w-1$, is reserved for special purposes (NaN and ∞). So the maximum value the field can have for normal numbers is $E = 2^w-2$. Then the maximum value the represented exponent can have is $e = E - \mathrm{bias} = (2^w-2) - (2^{w-1}-1) = 2^{w-1}-1$. (The maximum normal exponent value equals bias.) Also, the exponent field of zero is special, and e is specified to be 1−bias in this case.
The significand f is stored by putting its trailing p−1 bits in the significand field. The leading bit is inferred from the exponent field. If the exponent field is not zero and is not the reserved value with all ones, then the significand f is specified to be $1 + T \cdot 2^{1-p}$, where T is the binary number stored by the t bits in the significand field. Note that the largest value of the significand field, when all its bits are set, is $2^{p-1}-1$.
If the exponent field is zero, the significand f is specified to be $0 + T \cdot 2^{1-p}$.
When the exponent field does not have the special all-ones value or the zero value, the value represented by this encoding is $(-1)^S \cdot 2^e \cdot f$. When the exponent field is zero, the value represented is $(-1)^S \cdot 2^{1-\mathrm{bias}} \cdot f$.
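To make the decoding rules concrete, here is a minimal sketch that splits a float into S, E and T and rebuilds its value from them. decode_binary32 is a hypothetical helper name, and the code assumes float is the 32-bit IEEE-754 format (w = 8, p = 24, bias = 127). The result is computed in double, which can hold every float value exactly.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: rebuilds the value of a finite 32-bit float from its fields.
// Assumption: float is IEEE-754 binary32.
double decode_binary32(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined way to read the encoding

    const int S = static_cast<int>(bits >> 31);            // sign bit
    const int E = static_cast<int>((bits >> 23) & 0xFFu);  // biased exponent field
    const std::uint32_t T = bits & 0x7FFFFFu;              // trailing significand field
    const int bias = 127;
    const int p = 24;

    if (E == 0xFF) {
        return x;  // infinities and NaN are outside the formula; passed through unchanged
    }

    // Exponent field zero: e = 1 - bias and leading bit 0; otherwise e = E - bias and leading bit 1.
    const int e = (E == 0) ? 1 - bias : E - bias;
    const double f = (E == 0 ? 0.0 : 1.0) + std::ldexp(static_cast<double>(T), 1 - p);
    const double magnitude = std::ldexp(f, e);  // f * 2^e
    return S ? -magnitude : magnitude;
}
```

For any finite float x, decode_binary32(x) should compare equal to static_cast<double>(x), including subnormal inputs.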
Now we can figure out the minimum and maximum values. Of course, the minimum and maximum values representable in this format are −∞ and +∞, and the minimum magnitude is 0. But we are also interested in the minimum non-zero magnitude and the maximum finite number. (The minimum finite number is the negation of the maximum finite number.)
The maximum finite value occurs when the sign bit is zero, the exponent has its largest non-special value and the significand field has all its bits set. Then $e = 2^{w-1}-1$, and $T = 2^{p-1}-1$. So $f = 1 + (2^{p-1}-1) \cdot 2^{1-p} = 2 - 2^{1-p}$, and the number represented is $(-1)^0 \cdot 2^{2^{w-1}-1} \cdot (2 - 2^{1-p})$.
For the 32-bit width, w = 8 and p = 24, so the maximum value is $2^{2^{8-1}-1} \cdot (2 - 2^{1-24}) = 2^{127} \cdot (2 - 2^{-23}) = 2^{128} - 2^{104}$.
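The maximum-value formula can be checked numerically with a short sketch. max_finite and float_max_matches are illustrative names, the arithmetic is done in double, and the comparison against std::numeric_limits assumes float and double are the IEEE-754 32-bit and 64-bit formats.

```cpp
#include <cmath>
#include <limits>

// Largest finite value for exponent width w and precision p:
// 2^(2^(w-1) - 1) * (2 - 2^(1-p)), evaluated in double.
// Only meaningful while the result fits in a double (w <= 11, p <= 53).
double max_finite(int w, int p) {
    const int e_max = (1 << (w - 1)) - 1;               // largest normal exponent
    const double f_max = 2.0 - std::ldexp(1.0, 1 - p);  // significand with all bits set
    return std::ldexp(f_max, e_max);
}

// Compares the formula for w = 8, p = 24 with 2^128 - 2^104 and with the library value.
bool float_max_matches() {
    const double expected = std::ldexp(1.0, 128) - std::ldexp(1.0, 104);
    return max_finite(8, 24) == expected &&
           max_finite(8, 24) == static_cast<double>(std::numeric_limits<float>::max());
}
```

Printed in decimal, this value is roughly 3.4028235e38, which is where the 3.4*10^38 figure for float comes from.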
The minimum non-zero magnitude occurs when the exponent encoding E has its minimum value, zero, and the significand encoding T has its minimum non-zero value, one. Then the exponent e = 1 − bias, and the significand $f = 0 + T \cdot 2^{1-p} = 1 \cdot 2^{1-p} = 2^{1-p}$. The number represented is $(-1)^S \cdot 2^{1-\mathrm{bias}} \cdot 2^{1-p}$.
For the 32-bit format, bias = 127 and p = 24, so the minimum non-zero magnitude is $2^{1-127} \cdot 2^{1-24} = 2^{-149}$.
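The same kind of check works for the smallest magnitude. This is a sketch under the assumption that float and double are the IEEE-754 32-bit and 64-bit formats and that subnormals are not flushed to zero; min_positive and min_positive_matches are hypothetical names.

```cpp
#include <cmath>
#include <limits>

// Smallest positive (subnormal) magnitude for exponent width w and precision p:
// 2^(1 - bias) * 2^(1 - p), with bias = 2^(w-1) - 1. Evaluated in double.
double min_positive(int w, int p) {
    const int bias = (1 << (w - 1)) - 1;
    return std::ldexp(1.0, (1 - bias) + (1 - p));
}

// Compares the formula with the values the library reports for float and double.
bool min_positive_matches() {
    return min_positive(8, 24) == static_cast<double>(std::numeric_limits<float>::denorm_min()) &&
           min_positive(11, 53) == std::numeric_limits<double>::denorm_min();
}
```

Note that std::numeric_limits<float>::min() is a different number: it is the smallest normal value, 2^-126 (about 1.18e-38), while denorm_min() is the 2^-149 derived here.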
Footnote
¹ Only formats of widths 16, 32, 64, and multiples of 32 that are at least 128 are specified.
