10

I'm trying to determine the number of digits in a c# ulong number, i'm trying to do so using some math logic rather than using ToString().Length. I have not benchmarked the 2 approaches but have seen other posts about using System.Math.Floor(System.Math.Log10(number)) + 1 to determine the number of digits. Seems to work fine until i transition from 999999999999997 to 999999999999998 at which point, it i start getting an incorrect count.

Has anyone encountered this issue before ?

I have seen similar posts with a Java emphasis @ Why log(1000)/log(10) isn't the same as log10(1000)? and also a post @ How to get the separate digits of an int number? which indicates how i could possibly achieve the same using the % operator but with a lot more code

Here is the code i used to simulate this

Action<ulong> displayInfo = number => 
 Console.WriteLine("{0,-20} {1,-20} {2,-20} {3,-20} {4,-20}", 
  number, 
  number.ToString().Length, 
  System.Math.Log10(number), 
  System.Math.Floor(System.Math.Log10(number)),
  System.Math.Floor(System.Math.Log10(number)) + 1);

Array.ForEach(new ulong[] {
 9U,
 99U,
 999U,
 9999U,
 99999U,
 999999U,
 9999999U,
 99999999U,
 999999999U,
 9999999999U,
 99999999999U,
 999999999999U,
 9999999999999U,
 99999999999999U,
 999999999999999U,
 9999999999999999U,
 99999999999999999U,
 999999999999999999U,
 9999999999999999999U}, displayInfo);

Array.ForEach(new ulong[] {
 1U,
 19U,
 199U,
 1999U,
 19999U,
 199999U,
 1999999U,
 19999999U,
 199999999U,
 1999999999U,
 19999999999U,
 199999999999U,
 1999999999999U,
 19999999999999U,
 199999999999999U,
 1999999999999999U,
 19999999999999999U,
 199999999999999999U,
 1999999999999999999U
}, displayInfo);

Thanks in advance

Pat

Community
  • 1
  • 1
pmcgrath
  • 823
  • 2
  • 9
  • 21
  • 2
    You don’t need the `%` operator, integer division is enough. And the “lot more code” is a single-statement loop. Not all that much code. – Konrad Rudolph Feb 09 '11 at 19:20
  • 2
    Is ToString().Length really that expensive of a call? Maybe its a hair slower but it will be accurate for any value. To me, accuracy is more desirable than efficiency. – Jeff LaFay Feb 09 '11 at 19:22
  • 1
    It makes sense that you start getting incorrect results when using the `Log10` function, because that function uses the `double` type, which cannot represent very large integers exactly (see http://en.wikipedia.org/wiki/Double_precision_floating-point_format) – Justin Feb 09 '11 at 19:23
  • I agree, I don't see anything particularly wrong with using `ToString().Length`. Also, you can change your `Action` to a regular method, as long as the signature is correct. – Daniel T. Feb 09 '11 at 19:26
  • Please see http://stackoverflow.com/questions/4143000/find-the-string-length-of-an-int -- see the "cascading if" version if you want "speed" (can be broken into parts with a few divide similar to the iterative method). Otherwise might a well just `ToString` here. The `if` can be moved into a fast-break iteration of a List. –  Feb 09 '11 at 20:14
  • If you decide to use ToString please make sure to specify correct (invariant) culture - there are cultures where integers have separators in them like 100,000, as result ToString().Length will be wrong in such cultures. – Alexei Levenkov Feb 09 '11 at 21:26

4 Answers4

7

log10 is going to involve floating point conversion - hence the rounding error. The error is pretty small for a double, but is a big deal for an exact integer!

Excluding the .ToString() method and a floating point method, then yes I think you are going to have to use an iterative method but I would use an integer divide rather than a modulo.

Integer divide by 10. Is the result>0? If so iterate around. If not, stop. The number of digits is the number of iterations required.

Eg. 5 -> 0; 1 iteration = 1 digit.

1234 -> 123 -> 12 -> 1 -> 0; 4 iterations = 4 digits.

winwaed
  • 7,645
  • 6
  • 36
  • 81
  • 1
    Just a tease: A slight modification of this approach can actually be done with only *integer multiply* and compares. Looking at the lookup-table in my reply should shed some insight (no lookup-table required for the previous sentence). –  Feb 12 '11 at 00:20
  • 1
    Sure, do it in reverse. And this might be preferable if the multiply op is faster. – winwaed Feb 12 '11 at 04:12
  • It *is* faster. Question is if it's perceivable ;-) I just felt like leaving the comment because this struck me the other night -- after finishing my post -- and realized I've never seen it suggested as a solution for this problem before (that could mean one of several things...). –  Feb 12 '11 at 07:16
6

I would use ToString().Length unless you know this is going to be called millions of times.

"premature optimization is the root of all evil" - Donald Knuth

Gonzalo.-
  • 12,512
  • 5
  • 50
  • 82
John McDonald
  • 1,790
  • 13
  • 20
3

From the documentation:

By default, a Double value contains 15 decimal digits of precision, although a maximum of 17 digits is maintained internally.

I suspect that you're running into precision limits. Your value of 999,999,999,999,998 probably is at the limit of precision. And since the ulong has to be converted to double before calling Math.Log10, you see this error.

abatishchev
  • 98,240
  • 88
  • 296
  • 433
Jim Mischel
  • 131,090
  • 20
  • 188
  • 351
0

Other answers have posted why this happens.

Here is an example of a fairly quick way to determine the "length" of an integer (some cases excluded). This by itself is not very interesting -- but I include it here because using this method in conjunction with Log10 can get the accuracy "perfect" for the entire range of an unsigned long without requiring a second log invocation.

// the lookup would only be generated once
// and could be a hard-coded array literal
ulong[] lookup = Enumerable.Range(0, 20)
    .Select((n) => (ulong)Math.Pow(10, n)).ToArray();
ulong x = 999;
int i = 0;
for (; i < lookup.Length; i++) {
    if (lookup[i] > x) {
        break;
    }
}
// i is length of x "in a base-10 string"
// does not work with "0" or negative numbers

This lookup-table approach can be easily converted to any base. This method should be faster than the iterative divide-by-base approach but profiling is left as an exercise to the reader. (A direct if-then branch broken into "groups" is likely quicker yet, but that's way too much repetitive typing for my tastes.)

Happy coding.

  • Making lookup table static and using binary search instead of direct iteration may speed up this solution a bit more. If need to be seed up even more rolling out binary search into series of if statements may be the fastest. – Alexei Levenkov Feb 09 '11 at 21:23