Represent Integers with 2000 or more digits

Question

I would like to write a program, which could compute integers having more then 2000 or 20000 digits (for Pi's decimals). I would like to do in C++, without any libraries! (No big integer, boost,...). Can anyone suggest a way of doing it? Here are my thoughts:

using const char*, for holding the integer's digits;
representing the number like

( (1 * 10 + x) * 10  + x )...

I'd be tempted to use a `uint8_t` for each digit. Not particularly memory-friendly but will make arithmetic operations tractable. Or, consider the more difficult binary coded decimal (BCD) notation. — Bathsheba, Jul 23 '15 at 10:42
*Integers* won´t be useful for Pi. And why do you want no lib? — deviantfan, Jul 23 '15 at 10:42
1) actually, it isn't just for PI : also need to represent a number with 2001 digits of 1 like 11111111111... — , Jul 23 '15 at 10:43
What's stopping you from simply seeing how it's done in the libraries you don't want to use, and emulating that? — Benjamin Lindley, Jul 23 '15 at 10:47
And I didn't say you shouldn't. But if you're going to ask other people how you should do it, you might as well simply look how it's been done in the past, since that's the advice that people will give you on how to do it in your own library. — Benjamin Lindley, Jul 23 '15 at 10:50
Read a lot of papers and books on [bignums](https://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic). It is a mathematically difficult subject, and you also have programming issues. Coding a competitive bignum library could take you dozens of years. — Basile Starynkevitch, Jul 23 '15 at 11:06

danielschemmel · Accepted Answer · 2015-07-23T11:46:13.073

3

The obvious answer works along these lines:

class integer {
    bool negative;
    std::vector<std::uint64_t> data;
};

Where the number is represented as a sign bit and a (unsigned) base 2**64 value.

This means the absolute value of your number is:

data[0] + (data[1] << 64) + (data[2] << 128) + ....

Or, in other terms you represent your number as a little-endian bitstring with words as large as your target machine can reasonably work with. I chose 64 bit integers, as you can minimize the number of individual word operations this way (on a x64 machine).

To implement Addition, you use a concept you have learned in elementary school:

  a           b
+ x           y
------------------
 (a+x+carry) (b+y reduced to one digit length)

The reduction (modulo 2**64) happens automatically, and the carry can only ever be either zero or one. All that remains is to detect a carry, which is simple:

bool next_carry = false;
if(x += y < y) next_carry = true;
if(prev_carry && !++x) next_carry = true;

Subtraction can be implemented similarly using a borrow instead. Note that getting anywhere close to the performance of e.g. libgmp is... unlikely.

edited Jul 23 '15 at 11:46

answered Jul 23 '15 at 11:02

danielschemmel

10,885
1
36
58

As I see it it would be even more obvious to use two's complement form. – Cheers and hth. - Alf Jul 23 '15 at 11:03
You advise to use 2^64 as base. Can you implement carry in multiplication without assembly then? – stgatilov Jul 23 '15 at 11:04
1

To accommodate numbers like pi this would also require a data member to stipulate the location of the decimal point. – underscore_d Jul 23 '15 at 11:05
@stgatilov Yes - you treat them as two 32 bit numbers and do a tiny school multiplication. – danielschemmel Jul 23 '15 at 11:05
@underscore_d The question is explicitly about integers. Making rationals, fixed or floating point representations when integers are available is pretty straight-forward though. – danielschemmel Jul 23 '15 at 11:07
Performance does not matter here, only the result :) – Jul 23 '15 at 11:35
Given that OP has no experience with bignums yet, I'm afraid he'll have problems with printing power-of-two numbers in decimal form... – stgatilov Jul 23 '15 at 11:36
1

Can you just show how you would overload the + and = operator ? – Jul 23 '15 at 11:39
@MattMatt I sketched it out for you, but any good answer would not really fit this format. – danielschemmel Jul 23 '15 at 11:46

stgatilov · Answer 2 · 2015-07-23T11:25:56.310

A long integer is usually represented by a sequence of digits (see positional notation). For convenience, use little endian convention: A[0] is the lowest digit, A[n-1] is the highest one. In general case your number is equal to sum(A[i] * base^i) for some value of base.

The simplest value for base is ten, but it is not efficient. If you want to print your answer to user often, you'd better use power-of-ten as base. For instance, you can use base = 10^9 and store all digits in int32 type. If you want maximal speed, then better use power-of-two bases. For instance, base = 2^32 is the best possible base for 32-bit compiler (however, you'll need assembly to make it work optimally).

There are two ways to represent negative integers, The first one is to store integer as sign + digits sequence. In this case you'll have to handle all cases with different signs yourself. The other option is to use complement form. It can be used for both power-of-two and power-of-ten bases.

Since the length of the sequence may be different, you'd better store digit sequence in std::vector. Do not forget to remove leading zeroes in this case. An alternative solution would be to store fixed number of digits always (fixed-size array).

The operations are implemented in pretty straightforward way: just as you did them in school =)

P.S. Alternatively, each integer (of bounded length) can be represented by its reminders for a set of different prime modules, thanks to CRT. Such a representation supports only limited set of operations, and requires nontrivial convertion if you want to print it.

No, bignum arithmetic operations are not straightforward. They are using very difficult, but efficient, algorithms (much better that high-school ones). Bignum algorithms are really hard! — Basile Starynkevitch, Jul 23 '15 at 11:23
@BasileStarynkevitch Indeed, fastest bignum libraries use [FFT](https://en.wikipedia.org/wiki/Fast_Fourier_transform) for multiplication and etc. However, straightforward operations are easy to implement and would work well. That should be enough for OP. Besides, I heavily doubt that builtin BigInt uses any FFT in Java. — stgatilov, Jul 23 '15 at 11:28

Represent Integers with 2000 or more digits

2 Answers2