1

I've implemented a fixed Decimal class but I have an overflow problem caused by a propagation of precision when dividing.

At a high level the decimal is represented by:

template<class MANTISSA>
struct Decimal
{
                        //  Example
                        //  7.14
    MANTISSA mantissa_; //   714
    uint8_t exponent_;  //     2
};

However, I have a fundamental design flaw. My divide operator keeps calculating the remainder until it's zero or we run out of digits. The latter scenario occurs more-frequently than I realized.

I then multiply the output of divide by another decimal:

Decimal operator*(const Decimal& rhs) const
{
    Decimal result = *this;
    result.mantissa_ *= rhs.mantissa_;
    result.exponent_ += rhs.exponent_;
    return result;
}

There will most-likely be an overflow here because the mantissa from the divide output already has max digits (18 for int64_t).

To solve this I thought I could detect the overflow and reduce the multiplicands. I changed operator*() to check whether the exponents sum beyond 18. However, this is a bug. The exponents might be less than 18 but the mantissas can still be large enough to overflow.

To solve this properly I would need to count the number of mantissa digits prior to multiplying. And this requires a while loop, divide the mantissa by 10 until you reach zero and do this for both multiplicands- very expensive! Surely this cannot be the best approach??

At this point I have taken a step back because things seem complicated, How should I handle this? Do I fix the problem at source, don't let divide result in so many digits? Assuming so, do I require the user configure how many digits divides should output? Should the precision of divide output depend on the magnitude/exponent of the two input arguments?

Can anyone advise?

intrigued_66
  • 16,082
  • 51
  • 118
  • 189
  • 2
    Your multiply operator is broken; for normalized numbers you need to take the high half of a widening multiply and renormalize (which may involve a shift by 1 or not, or scaling by 10 for a decimal-float format). Assuming `mantissa_` is a normal integer type, you're taking the low half of the product. – Peter Cordes Dec 13 '22 at 23:10
  • 2
    "My divide operator keeps calculating the remainder until it's zero or we run out of digits." --> post that code. – chux - Reinstate Monica Dec 13 '22 at 23:15
  • 1
    IDK, I haven't looked at what libraries exist for decimal float. If your goal is to learn about FP math, then this sounds like a fun hobby project. If it's to get something performant and usable, I expect existing libraries are fairly good for most common widths, like 128-bit total size (IEEE decimal128 - https://en.wikipedia.org/wiki/Decimal128_floating-point_format) – Peter Cordes Dec 13 '22 at 23:23
  • @PeterCordes As you probably remember, I'm writing this myself to make-use of 128 bits. Would I just be better-off converting the C# Decimal class in to C++ with the GCC 128 integer? This is for production, not a learning exercise. – intrigued_66 Dec 13 '22 at 23:24
  • @PeterCordes I didn't see your message before I wrote my previous. – intrigued_66 Dec 13 '22 at 23:25
  • 1
    Do you really insist on 128-bit mantissa, instead of 128-bit total size with 110-bit mantissa? 128-bit mantissa would mean a lot of space wasted for alignment in a struct with 17 bytes of data. The wiki article discusses implementation choices of binary mantissa vs. BCD or similar decimal-based format for it. Without hardware support, a binary mantissa makes some operations a lot faster, but may make division very expensive; I haven't looked at exactly how FP division works to get a mantissa-width quotient with fractional bits instead of the remainder you get from integer division. – Peter Cordes Dec 13 '22 at 23:27
  • @PeterCordes I don't need 128 bit mantissa, but I definitely need more than 64. I have a situation where I am dividing numbers like 10 billion by 1e-10 or 1-12. This is what triggered the whole write-my-own-128-bit-decimal. So what you're effectively saying is, don't have exponent_ and base 10, try and stick to binary? – intrigued_66 Dec 13 '22 at 23:49
  • 1
    I'm saying that the exponent and sign bit can be fields in a 128-bit integer. And you get the mantissa by masking them away. That actual format of the mantissa bits is up to you; it could be binary integer, BCD, or DPD, or some other thing. – Peter Cordes Dec 14 '22 at 00:17
  • @PeterCordes okay, so pack everything within 128 bits, so I don't waste 15 bytes of padding? Then use masking to extract my integers and perform the same calculations I do now (normalizing where I need to)? Do you have any advise regarding the overflow i'm currently experiencing- how to calculate my divide to lower precision? – intrigued_66 Dec 14 '22 at 00:26
  • 1
    No, like I said I'm not sure what algorithm would be appropriate. I think for binary floating-point, you'd put the mantissa in the high half of the dividend (for narrowing division like x86 `div`; C doesn't have that operation, just like it's missing widening multiply). Maybe right shift by 1 to avoid division overflow if the normalize dividend mantissa is larger than the divider. So for decimal FP, maybe multiply by some power of 10 instead, like the number of decimal digits. But with a dividend wider than a register, maybe work a (decimal) digit at a time, long-division style? – Peter Cordes Dec 14 '22 at 03:32
  • 1
    Forgot to say in my last comment; I'm not sure about the details of getting correct rounding for the least decimal digit of the mantissa, which I guess is the point of a decimal-FP format. Or at least part of the point? A base-10 exponent does mean it can exactly represent decimal fractions like 0.123, even if you're using a binary mantissa? (Not just special cases like 0.125 (1/8) where the denominator is a power of 2 like for binary FP). – Peter Cordes Dec 14 '22 at 04:11
  • 1
    Anyway, if you're fine with a format where the mantissa is only *most* of a 128-bit integer, then you can use standard formats like IEEE decimal128, and look for libraries for that, instead of writing your own. (Or even compiler support.) – Peter Cordes Dec 14 '22 at 04:13
  • 1
    see this [iterative floating pooint divider](https://stackoverflow.com/a/18398246/2521214) for some inspiration on how to do division on floats ... If you look at edit history in there is also big num implementation (similar to what you trying to achieve) – Spektre Dec 14 '22 at 07:18

0 Answers0