
How do you decide which precision works best for your inference model? Both BF16 and FP16 take two bytes, but they use a different number of bits for the fraction and the exponent.

The range will be different, but I am trying to understand why one would choose one over the other.

Thank you

    |--------+------+----------+----------|
    | Format | Bits | Exponent | Fraction |
    |--------+------+----------+----------|
    | FP32   |   32 |        8 |       23 |
    | FP16   |   16 |        5 |       10 |
    | BF16   |   16 |        8 |        7 |
    |--------+------+----------+----------|

Range
bfloat16: ~1.18e-38 … ~3.40e38, with about 3 significant decimal digits.
float16:  ~5.96e-8 (smallest subnormal; ~6.10e-5 smallest normal) … 65504, with about 4 significant decimal digits.
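For reference, a quick way to check these limits yourself is `torch.finfo` (a minimal sketch, assuming PyTorch is available):

    import torch

    # Print the numeric limits of each format
    for dtype in (torch.float32, torch.float16, torch.bfloat16):
        info = torch.finfo(dtype)
        # tiny = smallest positive normal number, eps = spacing between 1.0 and the next value
        print(f"{dtype}: max={info.max:.3e}  tiny={info.tiny:.3e}  eps={info.eps:.3e}")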

RedFox
  • I think float16 is used for GPUs whereas bfloat16 is used for TPU mixed precision during training. – Innat Oct 01 '21 at 00:28
  • @M.Innat Ampere GPUs support bfloat16: https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/a100-80gb-datasheet-update-nvidia-us-1521051-r2-web.pdf – MWB Oct 02 '21 at 03:59

1 Answer


bfloat16 is generally easier to use, because it works as a drop-in replacement for float32: it keeps float32's 8-bit exponent, so it covers the same dynamic range, just with less precision. If your code doesn't produce nan/inf values or flush a non-zero value to 0 with float32, then, roughly speaking, it shouldn't do so with bfloat16 either. So, if your hardware supports it, I'd pick that.
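A small sketch of the difference (assuming PyTorch; any value above 65504 works the same way):

    import torch

    # A float32 value large enough to overflow float16 (max 65504) but not bfloat16
    x = torch.tensor(1e5, dtype=torch.float32)

    print(x.to(torch.float16))    # inf  -- outside float16's range
    print(x.to(torch.bfloat16))   # finite, ~1e5 -- bfloat16 shares float32's exponent range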

Check out AMP (automatic mixed precision) if you choose float16: float16's narrower exponent range typically requires loss scaling during training to avoid overflowing or underflowing gradients.
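A minimal AMP training-step sketch with PyTorch (`model`, `optimizer`, `loss_fn`, and `train_loader` are placeholders, and a CUDA GPU is assumed):

    import torch

    scaler = torch.cuda.amp.GradScaler()

    for inputs, targets in train_loader:
        optimizer.zero_grad()
        # Run the forward pass in float16 where it is safe to do so
        with torch.autocast(device_type="cuda", dtype=torch.float16):
            loss = loss_fn(model(inputs), targets)
        scaler.scale(loss).backward()  # scale the loss so float16 gradients don't underflow
        scaler.step(optimizer)         # unscales gradients before the optimizer step
        scaler.update()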

MWB
  • Thanks. I wanted to know what you would pick if the hardware supports both. – RedFox Oct 04 '21 at 21:26
  • @RedFox `bfloat16`, as my answer mentions at the end. (You probably read it before I wrote that part) – MWB Oct 04 '21 at 23:51
  • Thanks. They do offer different ranges of numbers. I'm wondering whether this range is the reason to choose one over the other. PS: I updated the question with the ranges. – RedFox Oct 05 '21 at 20:44