2

Suppose n is a BigInteger and I want its first 5 digits. I'm thinking in 2 ways:

  1. Dividing n by 10 log10(n)-5 times (so only the first 5 digits would remain).
  2. Get the Substring(0, 5) of n's string.

I have no idea which one is faster or if there is another option that may be better than these.

I would like to hear considerations about it before I test these options (what do you think of them, if there is something better, etc.).

phuclv
  • 37,963
  • 15
  • 156
  • 475
Daniel
  • 7,357
  • 7
  • 32
  • 84
  • 2
    The first division/modulus option should outperform the string option, as creating strings (and taking substrings) tends to be costly. – Tim Biegeleisen Jun 10 '21 at 03:57
  • 1
    [this answer](https://stackoverflow.com/a/34052627/4406995) tested with strings and log10 and found the strings solution to be 3 times slower (although they were trying to do 2 different things) – Hayden Jun 10 '21 at 03:59
  • @Hayden, in the link you sent, could you understand Russ' answer? I don't see how he evaluates the number of digits of a BigInteger (and his answer seemed the fastest) since his function receives an integer. – Daniel Jun 10 '21 at 04:10
  • Russ's answer receives an integer for the *number* of the digit, the evaluated BigInteger is created from there. You can skip it and take the BigInteger from parameter. – Martheen Jun 10 '21 at 04:32
  • `BigInteger.TryFormat(Span ...` would avoid the string allocation. – Jeremy Lakeman Jun 10 '21 at 04:34
  • The fastest is likely to involve `BigInteger.TryWriteBytes()` & some math, so you can stop when the next byte can't cause the answer to round up. But is it worth it? – Jeremy Lakeman Jun 10 '21 at 04:46
  • Why don't you perf test your solutions? You've been waiting long enough for an answer that you could have tested it yourself by now? – Caius Jard Jun 10 '21 at 06:10
  • `int result = (int)(value / BigInteger.Pow(10, (int)(value.GetBitLength() * 0.30102999566398 - 4)));` returns `5` or `6` digits integer; if we have `6` digits divide it by `10`: `if (result >= 1_000_000 || result < -1_000_000) result /= 10;` – Dmitry Bychenko Jun 10 '21 at 06:18
  • @DmitryBychenko feel free to post it as an answer. I'd like to understand a bit of these magic numbers. – Daniel Jun 10 '21 at 06:37
  • As `BigInteger` is not optimized for this operation in the slightest, if you are actually interested in performing this operation quickly it might be worth considering maintaining these digits separately, with a type that *is* suited to reading off decimal digits quickly (`decimal`). Of course `decimal` can't hold all the digits a `BigInteger` can, but depending on the kind of calculation you're doing it may be possible to maintain/recalculate the first digits on every operation. – Jeroen Mostert Jun 10 '21 at 07:30

1 Answers1

4

If we want to find out first 5 leading digits, we can exploit integer division:

   123     / 1   == 123
   12345   / 1   == 12345
   1234567 / 100 == 12345

Naive approach is to divide original value by 10 while it is bigger or equal 1_000_000:

   1234567 => 123456 => 12345

However, division is expensive especially if we have a huge number (if we have a number with ~1000000 digits we have to divide this number ~1000000 times). To create a faster solution we can compute an appropriate power of 10 (power computation is fast) and then divide just once:

  1234567 / Pow(10, Log10(1234567) - 5 + 1)

So the raw idea is

  result = source / BigInteger.Pow(10, (int)BigInteger.Log10(source) - 4);

There are two difficulties here:

  1. Negative numbers (we should take absolute value of source at least when we computing logarithms)
  2. Rounding errors for huge source (what if we have rounding up and have just 4 digits?)

To cope with both problems I compute Log10 myself as Log10(2) * Log2(source):

  value.GetBitLength() * 0.30102999566398

this guarantees that I have at least 5 digits but may be 6 in case of rounding errors (note 0.30102999566398 < Log10(2)).

Combining all together:

private static int FirstDigits(BigInteger value) {
  if (value.IsZero)
    return 0; 

  int digits = 5;

  int result = (int)(value / 
    BigInteger.Pow(10, (int)(value.GetBitLength() * 0.30102999566398 - digits + 1)));

  while (result >= 1_000_000 || result <= -1_000_000)
    result /= 10;  

  return result;   
}
Dmitry Bychenko
  • 180,369
  • 20
  • 160
  • 215