90

for instance,

n = 3432, result 4

n = 45, result 2

n = 33215, result 5

n = -357, result 3

I guess I could just turn it into a string then get the length of the string but that seems convoluted and hack-y.

Ken White
  • 123,280
  • 14
  • 225
  • 444
willc2
  • 38,991
  • 25
  • 88
  • 99
  • 7
    Getting the string length would fail in case of negative numbers. So get the length of the absolute value instead. ;-) – Wim ten Brink Jul 01 '09 at 12:50
  • 1
    char buff[100]; int r = sprintf(buff,"%s",n) - (r<0); – paxdiablo Jul 01 '09 at 13:25
  • 1
    you mean decimal digits? decimal places are something that real numbers have, and integers don't, by definition. – Will Jul 01 '09 at 13:38
  • 1
    Uh ... Pax, is that a legal expression? Since r doesn't have a value before the assignment, the "(r < 0)" part seems scary. Or perhaps you meant that it should bne done as a second step, so it's just the notation that I'm not getting (I'm reading it as if it were C). – unwind Jul 01 '09 at 13:41
  • @Will, yeah I was going to say "return 0;" – Nosredna Jul 01 '09 at 13:42
  • You're right, @unwind, it should have been n, not r: char buff[100]; int r = sprintf(buff,"%s",n) - (n<0); – paxdiablo Jul 01 '09 at 13:58
  • @Will, 'decimal digits' it is. – willc2 Jul 01 '09 at 14:02
  • Actually the real flaw is the %s instead of %d. – jmucchiello Jul 02 '09 at 02:12
  • 3
    Must ... remember ... to ... unit ... test! char buff[100]; int r = sprintf(buff,"%d",n) - (n<0); – paxdiablo Jul 02 '09 at 04:33
  • +1 This has been a very fun and educational question ! – Tom Jul 02 '09 at 15:51
  • Many years later, reading this question, I notice that no one asked *why* you need the number of digits in a number. The usual reason is to allocate space when serializing a number to characters. If that is the reason, then *all* the following answers are inefficient. :) – Preston L. Bannister Aug 05 '20 at 18:02

20 Answers20

157

The recursive approach :-)

int numPlaces (int n) {
    if (n < 0) return numPlaces ((n == INT_MIN) ? INT_MAX: -n);
    if (n < 10) return 1;
    return 1 + numPlaces (n / 10);
}

Or iterative:

int numPlaces (int n) {
    int r = 1;
    if (n < 0) n = (n == INT_MIN) ? INT_MAX: -n;
    while (n > 9) {
        n /= 10;
        r++;
    }
    return r;
}

Or raw speed:

int numPlaces (int n) {
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    if (n < 10) return 1;
    if (n < 100) return 2;
    if (n < 1000) return 3;
    if (n < 10000) return 4;
    if (n < 100000) return 5;
    if (n < 1000000) return 6;
    if (n < 10000000) return 7;
    if (n < 100000000) return 8;
    if (n < 1000000000) return 9;
    /*      2147483647 is 2^31-1 - add more ifs as needed
       and adjust this final return as well. */
    return 10;
}

Those above have been modified to better process MININT. On any weird systems that don't follow sensible 2n two's complement rules for integers, they may need further adjustment.

The raw speed version actually outperforms the floating point version, modified below:

int numPlaces (int n) {
    if (n == 0) return 1;
    return floor (log10 (abs (n))) + 1;
}

With a hundred million iterations, I get the following results:

Raw speed with 0:            0 seconds
Raw speed with 2^31-1:       1 second
Iterative with 2^31-1:       5 seconds
Recursive with 2^31-1:       6 seconds
Floating point with 1:       6 seconds
Floating point with 2^31-1:  7 seconds

That actually surprised me a little - I thought the Intel chips had a decent FPU but I guess general FP operations still can't compete with hand-optimized integer code.

Update following stormsoul's suggestions:

Testing the multiply-iterative solution by stormsoul gives a result of 4 seconds so, while it's much faster than the divide-iterative solution, it still doesn't match the optimized if-statement solution.

Choosing the arguments from a pool of 1000 randomly generated numbers pushed the raw speed time out to 2 seconds so, while it appears there may have been some advantage to having the same argument each time, it's still the fastest approach listed.

Compiling with -O2 improved the speeds but not the relative positions (I increased the iteration count by a factor of ten to check this).

Any further analysis is going to have to get seriously into the inner workings of CPU efficiency (different types of optimization, use of caches, branch prediction, which CPU you actually have, the ambient temperature in the room and so on) which is going to get in the way of my paid work :-). It's been an interesting diversion but, at some point, the return on investment for optimization becomes too small to matter. I think we've got enough solutions to have answered the question (which was, after all, not about speed).

Further update:

This will be my final update to this answer barring glaring errors that aren't dependent on architecture. Inspired by stormsoul's valiant efforts to measure, I'm posting my test program (modified as per stormsoul's own test program) along with some sample figures for all methods shown in the answers here. Keep in mind this is on a particular machine, your mileage may vary depending on where you run it (which is why I'm posting the test code).

Do with it as you wish:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <limits.h>
#include <time.h>

#define numof(a) (sizeof(a) / sizeof(a[0]))

/* Random numbers and accuracy checks. */

static int rndnum[10000];
static int rt[numof(rndnum)];

/* All digit counting functions here. */

static int count_recur (int n) {
    if (n < 0) return count_recur ((n == INT_MIN) ? INT_MAX : -n);
    if (n < 10) return 1;
    return 1 + count_recur (n / 10);
}

static int count_diviter (int n) {
    int r = 1;
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    while (n > 9) {
        n /= 10;
        r++;
    }
    return r;
}

static int count_multiter (int n) {
    unsigned int num = abs(n);
    unsigned int x, i;
    for (x=10, i=1; ; x*=10, i++) {
        if (num < x)
            return i;
        if (x > INT_MAX/10)
            return i+1;
    }
}

static int count_ifs (int n) {
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    if (n < 10) return 1;
    if (n < 100) return 2;
    if (n < 1000) return 3;
    if (n < 10000) return 4;
    if (n < 100000) return 5;
    if (n < 1000000) return 6;
    if (n < 10000000) return 7;
    if (n < 100000000) return 8;
    if (n < 1000000000) return 9;
    /*      2147483647 is 2^31-1 - add more ifs as needed
    and adjust this final return as well. */
    return 10;
}

static int count_revifs (int n) {
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    if (n > 999999999) return 10;
    if (n > 99999999) return 9;
    if (n > 9999999) return 8;
    if (n > 999999) return 7;
    if (n > 99999) return 6;
    if (n > 9999) return 5;
    if (n > 999) return 4;
    if (n > 99) return 3;
    if (n > 9) return 2;
    return 1;
}

static int count_log10 (int n) {
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    if (n == 0) return 1;
    return floor (log10 (n)) + 1;
}

static int count_bchop (int n) {
    int r = 1;
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    if (n >= 100000000) {
        r += 8;
        n /= 100000000;
    }
    if (n >= 10000) {
        r += 4;
        n /= 10000;
    }
    if (n >= 100) {
        r += 2;
        n /= 100;
    }
    if (n >= 10)
        r++;

    return r;
}

/* Structure to control calling of functions. */

typedef struct {
    int (*fnptr)(int);
    char *desc;
} tFn;

static tFn fn[] = {
    NULL,                              NULL,
    count_recur,    "            recursive",
    count_diviter,  "     divide-iterative",
    count_multiter, "   multiply-iterative",
    count_ifs,      "        if-statements",
    count_revifs,   "reverse-if-statements",
    count_log10,    "               log-10",
    count_bchop,    "          binary chop",
};
static clock_t clk[numof (fn)];

int main (int c, char *v[]) {
    int i, j, k, r;
    int s = 1;

    /* Test code:
        printf ("%11d %d\n", INT_MIN, count_recur(INT_MIN));
        for (i = -1000000000; i != 0; i /= 10)
            printf ("%11d %d\n", i, count_recur(i));
        printf ("%11d %d\n", 0, count_recur(0));
        for (i = 1; i != 1000000000; i *= 10)
            printf ("%11d %d\n", i, count_recur(i));
        printf ("%11d %d\n", 1000000000, count_recur(1000000000));
        printf ("%11d %d\n", INT_MAX, count_recur(INT_MAX));
    /* */

    /* Randomize and create random pool of numbers. */

    srand (time (NULL));
    for (j = 0; j < numof (rndnum); j++) {
        rndnum[j] = s * rand();
        s = -s;
    }
    rndnum[0] = INT_MAX;
    rndnum[1] = INT_MIN;

    /* For testing. */
    for (k = 0; k < numof (rndnum); k++) {
        rt[k] = (fn[1].fnptr)(rndnum[k]);
    }

    /* Test each of the functions in turn. */

    clk[0] = clock();
    for (i = 1; i < numof (fn); i++) {
        for (j = 0; j < 10000; j++) {
            for (k = 0; k < numof (rndnum); k++) {
                r = (fn[i].fnptr)(rndnum[k]);
                /* Test code:
                    if (r != rt[k]) {
                        printf ("Mismatch error [%s] %d %d %d %d\n",
                            fn[i].desc, k, rndnum[k], rt[k], r);
                        return 1;
                    }
                /* */
            }
        }
        clk[i] = clock();
    }

    /* Print out results. */

    for (i = 1; i < numof (fn); i++) {
        printf ("Time for %s: %10d\n", fn[i].desc, (int)(clk[i] - clk[i-1]));
    }

    return 0;
}

Remember that you need to ensure you use the correct command line to compile it. In particular, you may need to explicitly list the math library to get log10() working. The command line I used under Debian was gcc -o testprog testprog.c -lm.

And, in terms of results, here's the leader-board for my environment:

Optimization level 0:

Time for reverse-if-statements:       1704
Time for         if-statements:       2296
Time for           binary chop:       2515
Time for    multiply-iterative:       5141
Time for      divide-iterative:       7375
Time for             recursive:      10469
Time for                log-10:      26953

Optimization level 3:

Time for         if-statements:       1047
Time for           binary chop:       1156
Time for reverse-if-statements:       1500
Time for    multiply-iterative:       2937
Time for      divide-iterative:       5391
Time for             recursive:       8875
Time for                log-10:      25438
KraveXL
  • 305
  • 3
  • 12
paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953
  • 3
    Recursive version seems to me like the cleanest, simplest, best self-documenting solution posted. –  Jul 01 '09 at 13:06
  • @sharptooth: But recursive traversal of lists is standard practice in functional languages! – Brian Jul 01 '09 at 14:00
  • You can't assume you can get the abs of n if n is negative; if n is the minimum allowed value, you can't negate it. The opposite of -(2^31) is 2^31, which can't be represented as an int. – Matt Poush Jul 01 '09 at 14:52
  • @Brian: But C is not a functional language! – stormsoul Jul 01 '09 at 15:03
  • @Matt Poush: Yes, by the ISO C standard, the result of abs(MIN_INT) is undefined. But in reality, nearly all machines will return MIN_INT, which is the correct result, if you cast it to unsigned int. – stormsoul Jul 01 '09 at 15:07
  • @stormsoul: I wasn't trying to be that pedantic; I was assuming abs(MIN_INT) would return MIN_INT. The problem is that a number of the above functions assume that calling abs(MIN_INT) will return a positive number. For example, the iterative example negates a negative number, and then loops while the value is greater than 10. So, MIN_INT should return 10, but it returns 1, since it falls right through the while conditional. A better solution would be to create your own internal abs method that returns MAX_INT if the value is MIN_INT, or the actual abs value if the value isn't. – Matt Poush Jul 01 '09 at 15:22
  • What compiler did you use? Did you turn optimization on? I find it very hard to believe that a well-written iterative approach would be 5x slower than the unrolled version. Not with a good compiler. Also, a better measurement granularity would be much appreciated :) – stormsoul Jul 01 '09 at 15:35
  • @Matt Poush: Matt, it is sufficient to cast the result of abs() to unsigned int. The cast will flip MIN_INT to MAX_INT+1. – stormsoul Jul 01 '09 at 15:38
  • I have now read the code you tested. It is no wonder the iterative version you gave is slow, because it uses divisions, which are slow. Could you please time the multiplication-based iterative approach I have given? – stormsoul Jul 01 '09 at 15:41
  • gcc on Ubuntu 9.04, no particular flags so I don't know what the default optimization was (but it was in one executable anyway so they had the same optimization). The grnularity was 1 second (using time(0)) and averaged over 10 runs. The improvement was, I suspect, due to the fact that the raw speed version is doing only comparisons, not divisions. Your multiplication one may well do better (that was a nifty way around the speed issue but I can't test it now since the machine's in the bedroom and the wife's gone off to sleep). I'll give it a shot in the morning if you wish. – paxdiablo Jul 01 '09 at 15:42
  • Re the MININT problem, you could just add detection for that up front: (1) if they're power-of-two based, use "n = (n == MININT) ? MAXINT : -n". There's no power-of-two within one of a power-of ten so that would still return the right number of digits. (2) If they're wildly different (MAXINT = 100, MININT = -7), multiplex to two different raw-speed functions, one for +ve, the other for -ve. No doubt there are other solutions as well, they're just the ones I thought of off the top of my head. That particular problem affects all of the n=-n and n=abs(n) solutions. – paxdiablo Jul 01 '09 at 15:48
  • Sorry for comment-spamming, but I just got another idea ;). If you repeatedly time your function on THE SAME ARGUMENT, then no wonder your unrolled version will beat the asses out of every other. This is because the CPU will remember which branches are taken, and which are not, and rush through the code WITHOUT WAITING for the checks. Make an array of ~10000 random values (not more, to keep it in the L1 cache) and time the functions on that. – stormsoul Jul 01 '09 at 15:51
  • @Pax: You do not need to test for MININT - just cast it to unsigned int, which will give you the right result for NO cost. The (newer revisions of) ISO C standard guarantees integers are power-of-two based with a representation of signed integers being one of 3 possibilities allowed by the standard. The only one of those 3 possibilities where MININT != -MAXINT would be the most common one: the two's complement, where MININT == -(MAXINT+1). So the cast to unsigned int is pretty much a sure way to get the correct result everywhere. – stormsoul Jul 01 '09 at 16:06
  • I don't think you can cast blindly (Pax hesitates...) - in two's complement, -1 becomes 2^32-1 which is 10 digits long, not one as required. – paxdiablo Jul 01 '09 at 16:14
  • @Pax: GCC by default gives you no optimization. Pass -O2 or even -O3 to turn it on. If you additionally pass -funroll-loops you may even get the loop unrolled automatically by the compiler :). Also, optimization gives VERY different gains on different code, so it would change the ranking considerably. Your version and the float version would probably not gain much - if anything. The loops - quite on the contrary! – stormsoul Jul 01 '09 at 16:16
  • @Pax: Just HOW IN THE WORLD would you like abs() to return -1? The only negative value it can return is MIN_INT :). – stormsoul Jul 01 '09 at 16:18
  • @storm, I'm not talking about abs() returning -1, I'm stating that "int i = -1; unsinged int j = (unsigned int)i;" will set j to a big honkin' positive number. Or did you mean something else by your cast without cost? – paxdiablo Jul 01 '09 at 16:26
  • Never mind I think I got it it, you mean (unsigned int)(abs(i)) – paxdiablo Jul 01 '09 at 16:28
  • this is too premature optimization. it assumes that the user WOULD need to run this operation a hundred million times. if i'm going to run it only a few dozen times, the time spent optimizing this would be better spent elsewhere. – moogs Jul 02 '09 at 02:04
  • 6
    @moogs, you can use any of the solutions presented in this, or any other, answer here. The speed testing was really just an aside (that got out of hand). And in any case, you still have that time available to you - it's only *my* time that was possibly wasted here so feel free to use the fruits of my labor as you wish :-) – paxdiablo Jul 02 '09 at 02:34
  • You DO realize that rand() never returns negative values, do you? :) – stormsoul Jul 02 '09 at 10:34
  • 3
    A small performance tip - when you know some value is always non-negative, use unsigned types. They are slightly faster to multiply and divide. The compiler might guess for you that some variable is never negative and make this optimization automatically, but again, it might not. In more complex situations it never does. – stormsoul Jul 02 '09 at 10:46
  • 1
    Right. Someone in IRC made some performance tests, then he used unsigned and he god some really great boost. I urge you to try the unsigned world! :) – Johannes Schaub - litb Jul 02 '09 at 14:05
  • 2
    Nice answer =) I like the raw-speed version, but I think it can be improved by branching in a binary fashion to reduce the worst-case number of comparisons to four (disregarding the negative-test), obviously at the expense of readability. In that respect, one can tune a version of it specifically to the intended data range. – paddy Oct 26 '12 at 03:13
  • On my system, for that test snippet, you need to add `#include ` and `#include ` to the top, and the `log10` one doesn't work. – Nic May 12 '16 at 03:37
  • @QPaysTaxes, it compiled fine for me without `time.h` but, since it's good practice, I've added it in. The `stdio.h` one was already there. For the non-working `log10()`, it's probably due to not linking with the math library so I've added a note for that as well. Thanks for the heads-up. – paxdiablo May 12 '16 at 09:07
  • Huh, when I copied/pasted it, it didn't have stdio. I guess I did that wrong. Thanks for fixing it! – Nic May 12 '16 at 12:07
  • the native floating-point `log10` will be very slow. It's much better to use an [integer `log10` like this](https://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10). And the `abs` in the `log10` case won't work for `INT_MIN` like in your raw speed case – phuclv Mar 11 '18 at 02:09
119
floor (log10 (abs (x))) + 1

http://en.wikipedia.org/wiki/Logarithm

eduffy
  • 39,140
  • 13
  • 95
  • 92
  • 1
    I'd say ceil(...) instead of floor(...)+1. And of course a check for zero, as said. – Lennaert Jul 01 '09 at 12:49
  • 6
    @Len: `ceil` won't work for powers of 10: ceil(log10 (100)) == 2 – eduffy Jul 01 '09 at 12:57
  • 5
    My ancient understanding of C-type math libraries was that log and similar functions were fairly expensive, so I would tend to favor simple division, a la Pax's solution. Anyone know/comment? –  Jul 01 '09 at 13:08
  • @John, it is substantially slower but it doesn't really make a difference until you get into the hundred million iteration range (where the difference is seven seconds for floating point and 1 second for worst case optimized if statements). See my update. – paxdiablo Jul 01 '09 at 13:20
  • 14
    This would be needlessly slow. Don't use expensive functions such as log10() without a reason. The fast, integer function is simple enough to bother writing it. – stormsoul Jul 01 '09 at 14:31
  • 3
    Also, it would fail on x==0 AND x==INT_MIN, as usually abs(INT_MIN)==INT_MIN which is negative :). Cast the result of abs() to unsigned int to make the code slightly slower and more correct :). – stormsoul Jul 01 '09 at 15:00
  • 35
    Geez .. are you people still running an 8088? Who cares about few extra clock cycles. It took Paz 100,000,000 iterations to make a measurable difference, and even that was negligible! 6 seconds! Whoop-dee-do .. get on with your life. Next year it'll be 3 seconds. – eduffy Jul 01 '09 at 15:07
  • 12
    @eduffy: A millisecond here, a millisecond there... and suddenly, the user feels a noticable delay after clicking a button. Seriously, those small inefficiencies add up. Why waste clock cycles, when you don't gain anything by it? – stormsoul Jul 01 '09 at 15:30
  • 13
    @eduffy: if this were running on an embedded processor, there might not be any floating point support, let alone log functions, and the clock speed may only be in the tens of MHz - so an entirely integer-based solution would definitely be the preferred option. – Steve Melnikoff Jul 01 '09 at 16:12
  • 2
    In any case, we're mostly geeks here; surely faster is better, no matter what? ;-) – Steve Melnikoff Jul 01 '09 at 16:13
  • 11
    It turns out that although simple division is faster for small values, logarithm scales much better. If you call the division algorithms with every int from MIN_INT to MAX_INT (and repeat that the same 100m times as Paz's examples), you end up with an average of 13.337 seconds per call. Doing the same with Logarithm is an average of 8.143 seconds, the recursion takes 11.971 seconds, and the cascading If statements ends up taking an average of 0.953 seconds. So, the Daily-WTF-looking solution is an order of magnitude faster, but in the long run, this is in second place. – Matt Poush Jul 01 '09 at 18:55
  • 2
    Why not go with this if you think this is the clearest way (which in my opinion is), and if you really find it actually makes a difference you can easily switch to any of the other solutions, which are all small and nifty. – Skurmedel Jul 01 '09 at 19:34
  • @Matt Poush: What? Either you don't understand something, or have a really braindead compiler. You repeat the division at most n times, where n is the number of digits in the number (10 with 32-bit int). And the division is actually implemented as a multiplication by a good compiler. That's WAY faster than doing a single FP logarithm. By the way, ALL compilers are braindead if you don't turn optimization on :). – stormsoul Jul 02 '09 at 00:35
  • @stormsoul, are we still doing intensive calculations on a single thread? That was so last year. – Pyrolistical Jul 24 '09 at 21:51
  • 2
    @SteveMelnikoff But this isn't necessarily being run on an embedded processor. There are numerous restrictions if you go far enough into embedded systems so lets not just assume all code is embedded otherwise what's the point of all of these nice CPU features that Intel and AMD are pumping out? If you assume that every problem is a nail you naturally tend to pick the hammer to solve it! – nonsensickle Dec 31 '13 at 00:37
  • No need to compute log or divide at all with this see my edit in answer using this for ages – Spektre May 03 '15 at 09:40
  • 2
    My two cents on a 6 year old question: when doing thousands of these operations using parallel processes, those few seconds really add up in the long term for large data sets. – Blaise Nov 30 '15 at 14:28
  • @eduffy your comment is ignorant of the fact that he was asking a C question. It did not deserve 23 upvotes. C is being used today for more microprocessors than you would guess and you guessed wrong: the vast majority is MUCH MUCH slower than an 8088 which clocked up to 10MHZ. So for the sacke of reason, do not comment that on a general question. Such computers might not even have formatstring libraries, they likely won't even have a log() function available without writing it. Your comment would have been fine if it's about a modern PC program, but we don't know. – John Sep 29 '17 at 21:37
  • 3
    This doesn't work for 999999999999998, 999999999999999 and longer versions of these — even if we replace `abs` with `fabs` or `llabs` (without this replacement it couldn't work even in principle), at least on the common systems where `double` is IEEE 754 binary64. – Ruslan Apr 06 '20 at 19:08
  • 2
    My two cents on a now 12 year old question: Efficiency is important, but *how* important it is -- when we need to strive for speed, versus when we need to strive for convenience or obviousness or other virtues -- is endlessly subjective. Me, even on a high-powered processor, I would usually not haul out a full-boown floating-point logarithm just to compute the number of digits in an integer -- but at the same time, this *is* one way to do it, and it's a way that any competent programmer should know, so it was quite proper of eduffy to post it. And the upvotes prove that others agree! – Steve Summit Jun 18 '21 at 01:06
29

The shortest answer: snprintf(0,0,"%+d",n)-1

snprintf with n=0 does not store anything and allows a null buffer pointer; the return value is the number of characters that would have been written. The + modifier is used to print a sign (+ or -) even if the value is non-negative; subtracting one from the result discounts the sign from being counted as a digit.

Mustafa Aydın
  • 17,645
  • 4
  • 15
  • 38
R.. GitHub STOP HELPING ICE
  • 208,859
  • 35
  • 376
  • 711
  • 7
    `snprintf` with `n=0` does not store anything and allows a null buffer pointer; the return value is the number of characters that would have been written. The `+` modifier is used to print a sign (`+` or `-`) even if the value is non-negative; subtracting one from the result discounts the sign from being counted as a digit. – R.. GitHub STOP HELPING ICE Apr 13 '15 at 05:29
  • 1
    From my (Debian Linux) system's `man`-page on `snprintf`: _"Concerning the return value of `snprintf()`, __SUSv2__ and __C99__ contradict each other: when `snprintf()` is called with `size=0` [size is the second argument above] then SUSv2 stipulates an unspecified return value less than 1, while C99 allows str to be NULL in this case, and gives the return value (as always) as the number of characters that would have been written in case the output string has been large enough."_ {SUS = Single UNIX Specification} – SlySven Jul 22 '16 at 21:59
  • 6
    @SlySven: SUSv2 is ancient and irrelevant. – R.. GitHub STOP HELPING ICE Jul 22 '16 at 22:26
  • Well Debian are known for being somewhat conservative and not being the fastest to take on-board new stuff. 8-P Thanks for your answer - I have used it in a FOSS project I'm coding for (and attributed it to here accordingly)...! – SlySven Jul 24 '16 at 04:29
  • 2
    @SlySven: The doesn't come from Debian AFAIK, just the Linux man-pages project. I don't think there was ever a Linux libc with the wrong `snprintf` behavior, but there may have been a few ancient proprietary unices, and MSVC's `_snprintf` had the bug too. – R.. GitHub STOP HELPING ICE Jul 24 '16 at 15:38
  • This variant is not shorter, but uses less ink: `snprintf(0,0,"% d",n)-1` – chqrlie Feb 19 '17 at 02:31
  • This may give a wrong result because snprintf uses global locale. – vitaut Jun 04 '21 at 22:34
  • @vitaut: locale is not permitted to affect the printing of integers. – R.. GitHub STOP HELPING ICE Jun 05 '21 at 04:12
  • It very much is, see e.g. https://stackoverflow.com/questions/1449805/how-to-format-a-number-from-1123456789-to-1-123-456-789-in-c – vitaut Jun 05 '21 at 04:37
  • Ah, nevermind, it requires a separate specifier. – vitaut Jun 05 '21 at 04:42
26

Binary search pseudo algorithm to get no of digits of r in v..

if (v < 0 ) v=-v;

r=1;

if (v >= 100000000)
{
  r+=8;
  v/=100000000;
}

if (v >= 10000) {
    r+=4;
    v/=10000;
}

if (v >= 100) {
    r+=2;
    v/=100;
}

if( v>=10)
{
    r+=1;
}

return r;
Nathan Fellman
  • 122,701
  • 101
  • 260
  • 319
lakshmanaraj
  • 4,145
  • 23
  • 12
  • 4
    Why the downvote, this looks even more efficient than the divide-by-ten loops? – paxdiablo Jul 01 '09 at 13:37
  • 1
    Downvote was not from me, but I suspect it was because this is less readable than Wayne Shephard's variation (and probably slower). – Brian Jul 01 '09 at 14:04
  • 5
    I see, but I don't think it's right to downvote something for being *less* helpful - the popup clearly states "This answer is not helpful". In that case, I would upvote the other and leave this one alone. This was a genuine improvement over the /10 iteration. Still, it's positive now so no harm, no foul. (This isn't directed at you Brian since, as you already said, you didn't do it). Just food for thought for whoever did. – paxdiablo Jul 01 '09 at 14:12
  • 5
    My algorithm can be easily extended for longlong variable by having another if statement at the beginning if (v >= 10000000000000000LL) { r+=16; v/=10000000000000000LL; } and will be faster than all the approaches. – lakshmanaraj Jul 02 '09 at 07:11
12

Here is a very fast method to compute the number of decimal digits by Kendall Willets:

int count_digits(uint32_t n) {
#ifndef __has_builtin
#  define __has_builtin(x) 0
#endif
#if __has_builtin(__builtin_clz)
  // This increments the upper 32 bits (log10(T) - 1) when >= T is added.
  #  define K(T) (((sizeof(#T) - 1ull) << 32) - T)
  static const uint64_t table[] = {
      K(0),          K(0),          K(0),           // 8
      K(10),         K(10),         K(10),          // 64
      K(100),        K(100),        K(100),         // 512
      K(1000),       K(1000),       K(1000),        // 4096
      K(10000),      K(10000),      K(10000),       // 32k
      K(100000),     K(100000),     K(100000),      // 256k
      K(1000000),    K(1000000),    K(1000000),     // 2048k
      K(10000000),   K(10000000),   K(10000000),    // 16M
      K(100000000),  K(100000000),  K(100000000),   // 128M
      K(1000000000), K(1000000000), K(1000000000),  // 1024M
      K(1000000000), K(1000000000)                  // 4B
  };
  return (n + table[__builtin_clz(n | 1) ^ 31]) >> 32u;
#else
  int count = 1;
  for (;;) {
    if (n < 10) return count;
    if (n < 100) return count + 1;
    if (n < 1000) return count + 2;
    if (n < 10000) return count + 3;
    n /= 10000u;
    count += 4;
  }
  return count;
#endif
}

The fast path relies on __builtin_clz which is available in GCC and clang but thanks to the fallback that works reasonably well count_digits is fully portable.

This generates very efficient code (godbolt):

count_digits(unsigned int):
  mov edx, edi
  mov eax, edi
  or edx, 1
  bsr edx, edx
  movsx rdx, edx
  add rax, QWORD PTR count_digits(unsigned int)::table[0+rdx*8]
  shr rax, 32
  ret
vitaut
  • 49,672
  • 25
  • 199
  • 336
  • `#if defined(__has_builtin) && __has_builtin(__builtin_clz)` is NOT portable. – Greg A. Woods Jun 14 '21 at 03:41
  • Also your replacement for `int_log2_64()` is completely wrong for the variant *not* using `__builtin_clz()`! Perhaps it would be better if you just copy-pasted the original code instead of trying to hack it. – Greg A. Woods Jun 16 '21 at 20:31
  • @GregA.Woods why not? – vitaut Jun 17 '21 at 02:54
  • Why not _what_? – Greg A. Woods Jun 17 '21 at 04:25
  • Why do you think it's not portable? – vitaut Jun 17 '21 at 13:24
  • Because it doesn't work for some compilers that do not define `__has_builtin`. – Greg A. Woods Jun 17 '21 at 17:17
  • Now you should fix your replacement of a log2 algorithm with a log10 algorithm. – Greg A. Woods Jun 18 '21 at 17:03
  • I think you misunderstand the algorithm, please see the link for more details. – vitaut Jun 18 '21 at 19:42
  • 1
    Ah, yes, sorry -- you've mutated Willit's original code so much that I missed that you weren't trying to provide a more portable `int_log2_64()`, but were instead rather just replacing all of `count_digits()` for compilers without the builtin feature. – Greg A. Woods Jun 19 '21 at 00:10
  • If you can use `unsigned` to get it to zero-extend the BSR result instead of wasting a MOVSXD sign-extending it, that would be even better. Or if GCC is finicky, possibly us `unsigned long` / `__builtin_clzl` to use 64-bit operand-size on 64-bit systems. (But then you'd need to use `numeric_limits::digits - 1` instead of `31`. https://en.cppreference.com/w/cpp/types/numeric_limits/digits) – Peter Cordes Dec 13 '21 at 10:08
11

Divide by 10 in a loop until the result reaches zero. The number of iterations will correspond to the number of decimal digits.

Assuming that you expect to get 0 digits in a zero value:

int countDigits( int value )
{
    int result = 0;
    while( value != 0 ) {
       value /= 10;
       result++;
    }
    return result;
}
sharptooth
  • 167,383
  • 100
  • 513
  • 979
  • floor(log10(abs(x)))+1 would be faster, but eduffy has already suggested that! :-) – Wim ten Brink Jul 01 '09 at 12:49
  • I'd be curious to see that timed. I'd almost think that an optimized series of if statements (based on maxint) may outperform a floating point logarithm (but I'm too lazy to test it myself). – paxdiablo Jul 01 '09 at 12:52
  • It's never going to reach zero, is it? –  Jul 01 '09 at 12:53
  • @John Pirie: Why wouldn't it? I mean integer division and when applied iteratively to the same variable it will eventually give zero. – sharptooth Jul 01 '09 at 12:55
  • @JP, if you keep dividing an integer by 10, it will reach zero eventually. – paxdiablo Jul 01 '09 at 12:56
  • @sharp/Pax, doh, was thinking in doubles, you're absolutely right, thanks. –  Jul 01 '09 at 13:09
  • Actually, @Workshop, the floating point solution *isn't* faster but on par with this iterative solution. It's slower than the hand-optimized if-statement solution by a factor of about seven. – paxdiablo Jul 01 '09 at 14:08
  • This is SLOW, because you use divisions, which are slow (~7-10 times slower than multiplications, latency-wise). Instead of dividing the value, multiply the threshold you compare it to. – stormsoul Jul 01 '09 at 14:38
  • @me: Alright, these are divisions by a small constant, so they're only ~2-5 times slower (on a good optimizing compiler). Still, the main point I made holds :). – stormsoul Jul 01 '09 at 14:44
  • A good optimizing compiler replaces the division by constant with a multiplication and a shift and this stops being so slow. – sharptooth Jul 02 '09 at 13:13
  • @sharptooth: That's what I was thinking about. But the multiplication by a small constant will get replaced by a few simple shifts & adds, which is still faster than a multiply. – stormsoul Jul 02 '09 at 15:11
9

Constant-cost version that uses x86 assembly and a lookup table:

int count_bsr(int i) {
    struct {
            int max;
            int count;
    } static digits[32] = {
            { 9, 1 }, { 9, 1 }, { 9, 1 }, { 9, 1 },
            { 99, 2 }, { 99, 2 }, { 99, 2 },
            { 999, 3 }, { 999, 3 }, { 999, 3 },
            { 9999, 4 }, { 9999, 4 }, { 9999, 4 }, { 9999, 4 },
            { 99999, 5 }, { 99999, 5 }, { 99999, 5 },
            { 999999, 6 }, { 999999, 6 }, { 999999, 6 },
            { 9999999, 7 }, { 9999999, 7 }, { 9999999, 7 }, { 9999999, 7 },
            { 99999999, 8 }, { 99999999, 8 }, { 99999999, 8 },
            { 999999999, 9 }, { 999999999, 9 }, { 999999999, 9 },
            { INT_MAX, 10 }, { INT_MAX, 10 }
    };
        register const int z = 0;
        register unsigned log2;
        if (i < 0) i = -i;
        __asm__ __volatile__ (
                "bsr %1, %0;"  \
                "cmovz %2, %0;"\
                : "=r" (log2)  \
                : "rm" (i), "r"(z));
        return digits[log2].count + ( i > digits[log2].max );
}

Another one, with a smaller lookup table and a log10 approximation taken from here.

int count_bsr2( int i ) {
    static const unsigned limits[] =
            {0, 10, 100, 1000, 10000, 100000,
             1000000, 10000000, 100000000, 1000000000};
        register const int z = 0;
        register int l, log2;
        if (i < 0) i = -i;
        __asm__ __volatile__ (
                "bsr %1, %0;"  \
                "cmovz %2, %0;"\
                : "=r" (log2)  \
                : "rm" (i), "r"(z));
       l = (log2 + 1) * 1233 >> 12;
       return (l + ((unsigned)i >= limits[l]));
}

Both of these take advantage of the fact that on x86 -INT_MIN is equal to INT_MIN.

Update:

As per suggestion here are the timings for the count_bsr and a slightly faster 64-bit only count_bsr_mod routines compared to the binary search and binary chop algos using very nice paxdiablo's test program modified to generate sets with a random sign distribution. Tests were built with gcc 4.9.2 using "-O3 -falign-functions=16 -falign-jumps=16 -march=corei7-avx" options and executed on an otherwise quiescent Sandy Bridge system with turbo and sleep states off.

Time for               bsr mod:     270000  
Time for                   bsr:     340000  
Time for           binary chop:     800000  
Time for         binary search:     770000  
Time for     binary search mod:     470000  

Source for the test,

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <limits.h>
#include <time.h>

#define numof(a) (sizeof(a) / sizeof(a[0]))

/* Random numbers and accuracy checks. */

static int rndnum[10000];
static int rt[numof(rndnum)];

/* All digit counting functions here. */

static int count_bchop (int n) {
    int r = 1;
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    if (n >= 100000000) {
        r += 8;
        n /= 100000000;
    }
    if (n >= 10000) {
        r += 4;
        n /= 10000;
    }
    if (n >= 100) {
        r += 2;
        n /= 100;
    }
    if (n >= 10)
        r++;

    return r;
}

static int count_bsearch(int i)
{
    if (i < 0)
    {
        if (i == INT_MIN)
            return 11; // special case for -2^31 because 2^31 can't fit in a two's complement 32-bit integer
        i = -i;
    }
    if              (i < 100000) {
        if          (i < 1000) {
            if      (i < 10)         return 1;
            else if (i < 100)        return 2;
            else                     return 3;
        } else {
            if      (i < 10000)      return 4;
            else                     return 5;
        }
    } else {
        if          (i < 10000000) {
            if      (i < 1000000)    return 6;
            else                     return 7;
        } else {
            if      (i < 100000000)  return 8;
            else if (i < 1000000000) return 9;
            else                     return 10;
        }
    }
}

// Integer log base 10, modified binary search.
static int count_bsearch_mod(int i) {
   unsigned x = (i >= 0) ? i : -i;
   if (x > 99)
      if (x > 999999)
         if (x > 99999999)
            return 9 + (x > 999999999);
         else
            return 7 + (x > 9999999);
      else
         if (x > 9999)
            return 5 + (x > 99999);
         else
            return 3 + (x > 999);
   else
         return 1 + (x > 9);
}

static int count_bsr_mod(int i) {
    struct {
            int m_count;
            int m_threshold;
    } static digits[32] =
    {
      { 1, 9 }, { 1, 9 }, { 1, 9 }, { 1, 9 },
      { 2, 99 }, { 2, 99 }, { 2, 99 },
      { 3, 999 }, { 3, 999 }, { 3, 999 },
      { 4, 9999 }, { 4, 9999 }, { 4, 9999 }, { 4, 9999 },
      { 5, 99999 }, { 5, 99999 }, { 5, 99999 },
      { 6, 999999 }, { 6, 999999 }, { 6, 999999 },
      { 7, 9999999 }, { 7, 9999999 }, { 7, 9999999 }, { 7, 9999999 },
      { 8, 99999999 }, { 8, 99999999 }, { 8, 99999999 },
      { 9, 999999999 }, { 9, 999999999 }, { 9, 999999999 },
      { 10, INT_MAX }, { 10, INT_MAX }
    };
        __asm__ __volatile__ (
            "cdq                    \n\t"
            "xorl %%edx, %0         \n\t"
            "subl %%edx, %0         \n\t"
            "movl %0, %%edx         \n\t"
            "bsrl %0, %0            \n\t"
            "shlq $32, %%rdx        \n\t"
            "movq %P1(,%q0,8), %q0  \n\t"
            "cmpq %q0, %%rdx        \n\t"
            "setg %%dl              \n\t"
            "addl %%edx, %0         \n\t"
                : "+a"(i)
                : "i"(digits)
                : "rdx", "cc"
        );
    return i;
}

static int count_bsr(int i) {
    struct {
            int max;
            int count;
    } static digits[32] = {
            { 9, 1 }, { 9, 1 }, { 9, 1 }, { 9, 1 },
            { 99, 2 }, { 99, 2 }, { 99, 2 },
            { 999, 3 }, { 999, 3 }, { 999, 3 },
            { 9999, 4 }, { 9999, 4 }, { 9999, 4 }, { 9999, 4 },
            { 99999, 5 }, { 99999, 5 }, { 99999, 5 },
            { 999999, 6 }, { 999999, 6 }, { 999999, 6 },
            { 9999999, 7 }, { 9999999, 7 }, { 9999999, 7 }, { 9999999, 7 },
            { 99999999, 8 }, { 99999999, 8 }, { 99999999, 8 },
            { 999999999, 9 }, { 999999999, 9 }, { 999999999, 9 },
            { INT_MAX, 10 }, { INT_MAX, 10 }
    };
        register const int z = 0;
        register unsigned log2;
        if (i < 0) i = -i;
        __asm__ __volatile__ (
                "bsr %1, %0;"  \
                "cmovz %2, %0;"\
                : "=r" (log2)  \
                : "rm" (i), "r"(z));
        return digits[log2].count + ( i > digits[log2].max );
}

/* Structure to control calling of functions. */

typedef struct {
    int (*fnptr)(int);
    const char *desc;
} tFn;

static tFn fn[] = {
 {   NULL,                              NULL },
 {   count_bsr_mod,  "              bsr mod" },
 {   count_bsr,      "                  bsr" },
 {   count_bchop,    "          binary chop" },
 {   count_bsearch,  "        binary search" },
 {   count_bsearch_mod,"    binary search mod"}
};
static clock_t clk[numof (fn)];

int main (int c, char *v[]) {
    int i, j, k, r;
    int s = 1;

    /* Test code:
        printf ("%11d %d\n", INT_MIN, count_bsearch(INT_MIN));
        //for (i = -1000000000; i != 0; i /= 10)
        for (i = -999999999; i != 0; i /= 10)
            printf ("%11d %d\n", i, count_bsearch(i));
        printf ("%11d %d\n", 0, count_bsearch(0));
        for (i = 1; i != 1000000000; i *= 10)
            printf ("%11d %d\n", i, count_bsearch(i));
        printf ("%11d %d\n", 1000000000, count_bsearch(1000000000));
        printf ("%11d %d\n", INT_MAX, count_bsearch(INT_MAX));
    return 0;
    /* */

    /* Randomize and create random pool of numbers. */

    int p, n;
    p = n = 0;
    srand (time (NULL));
    for (j = 0; j < numof (rndnum); j++) {
        rndnum[j] = ((rand() & 2) - 1) * rand();
    }
    rndnum[0] = INT_MAX;
    rndnum[1] = INT_MIN;

    /* For testing. */
    for (k = 0; k < numof (rndnum); k++) {
        rt[k] = (fn[1].fnptr)(rndnum[k]);
    }

    /* Test each of the functions in turn. */

    clk[0] = clock();
    for (i = 1; i < numof (fn); i++) {
        for (j = 0; j < 10000; j++) {
            for (k = 0; k < numof (rndnum); k++) {
                r = (fn[i].fnptr)(rndnum[k]);
                /* Test code:
                    if (r != rt[k]) {
                        printf ("Mismatch error [%s] %d %d %d %d\n",
                            fn[i].desc, k, rndnum[k], rt[k], r);
                        return 1;
                    }
                /* */
            }
        }
        clk[i] = clock();
    }

    /* Print out results. */

    for (i = 1; i < numof (fn); i++) {
        printf ("Time for %s: %10d\n", fn[i].desc, (int)(clk[i] - clk[i-1]));
    }

    return 0;
}
  • 1
    +1 for the most geeky answer. You should add performance figures to show how well it performs, especially compared to binary chop. – CodeManX Aug 16 '15 at 21:04
  • There is a bug in `count_bsearch()`: for the OP's semantics, it should return `10` for `i == INT_MIN`. – chqrlie Feb 19 '17 at 03:45
  • `-i` has undefined behaviour for INT_MIN, for signed `int i`. Use `unsigned absval = 0U - i` (or `i` if positive) to avoid it in C but still compile efficiently to the same asm for the negation. Unless you compile with `-fwrapv`, it's more of a "happens to work" situation than fully safely inheriting the behaviour of the ISA you're targeting. – Peter Cordes Dec 20 '20 at 05:06
7

You can do: floor (log10 (abs (x))) + 1 Or if you want to save on cycles you could just do comparisons

if(x<10)
  return 1;
if(x<100)
  return 2;
if(x<1000)
  return 3;
etc etc

This avoids any computationally expensive functions such as log or even multiplication or division. While it is inelegant this can be hidden by encapsulating it into a function. It isn't complex or difficult to maintain so I would not dismiss this approach on account of poor coding practice; I feel to do so would be throwing the baby out with the bath water.

Howard May
  • 6,639
  • 9
  • 35
  • 47
  • 1
    or I could just throw up a dialog box and ask the user, heh – willc2 Jul 01 '09 at 13:26
  • 2
    And why the downvote here? This turns out to be blindingly fast. – paxdiablo Jul 01 '09 at 13:37
  • the log function would have to be pretty bad if this solution is faster for the general case – David Cournapeau Jul 02 '09 at 16:21
  • 2
    @David: Off the top of my head, logarithms take somewhere around 250-700 cycles depending on the cpu. Even if you figure each branch in this answer takes 25 cycles, you'd need 10-30 digits before it got slower than a logarithm, and that's the worst case. If your typical numbers are small, it's even better. – R.. GitHub STOP HELPING ICE May 18 '12 at 01:56
7

Here is an unrolled binary search without any division or multiplication. Depending on the distribution of numbers given to it, it may or may not beat the other ones done with unrolled if statements, but should always beat the ones that use loops and multiplication/division/log10.

With a uniform distribution of random numbers encompassing the whole range, on my machine it averaged 79% of the execution time of paxdiablo's count_bchop(), 88% the time of count_ifs(), and 97% of the time of count_revifs().

With an exponential distribution (the probability of a number having n digits is equal to the that of it having m digits, where mn) count_ifs() and count_revifs() both beat my function. I'm not sure why at this point.

int count_bsearch(int i)
{
    if (i < 0)
    {
        if (i == INT_MIN)
            return 10; // special case for -2^31 because 2^31 can't fit in a two's complement 32-bit integer
        i = -i;
    }
    if              (i < 100000) {
        if          (i < 1000) {
            if      (i < 10)         return 1;
            else if (i < 100)        return 2;
            else                     return 3;
        } else {
            if      (i < 10000)      return 4;
            else                     return 5;
        }
    } else {
        if          (i < 10000000) {
            if      (i < 1000000)    return 6;
            else                     return 7;
        } else {
            if      (i < 100000000)  return 8;
            else if (i < 1000000000) return 9;
            else                     return 10;
        }
    }
}
Deadcode
  • 860
  • 9
  • 15
  • 1
    That's funny... I wrote a comment about doing exactly this just before, after seeing the 'raw speed' version in paxdiablo's answer. Then I discovered you had written this answer about 15 minutes earlier. Oh well, +1 =) Note that you can change the boundaries to tweak the function's performance in favour of particular data ranges. – paddy Oct 26 '12 at 03:52
  • You've gotta be kidding me! What are the odds? All the other answers were posted over 3 years ago. Our stories are even a bit similar. I started programming in BASIC on an IBM XT when I was 8 years old. – Deadcode Oct 26 '12 at 03:58
  • I was looking at the "active posts" list. This showed up and looked interesting. I got down to paxdiablo's post, made a comment, then wandered off... Came back later and saw another modification so I got curious. It was yours. Do you think we're mutual doppelgangers? – paddy Oct 26 '12 at 04:04
  • There is a bug in `count_bsearch()`: for the OP's semantics, it should return `10` for `i == INT_MIN`. – chqrlie Feb 19 '17 at 03:40
6

From Bit Twiddling Hacks :

Find integer log base 10 of an integer the obvious way

Note the ordering of comparisons in it.

fa.
  • 2,456
  • 16
  • 17
5

I stumbled across this during a google search: http://web.archive.org/web/20190108211528/http://www.hackersdelight.org/hdcodetxt/ilog.c.txt

A quick benchmark clearly showed the binary search methods winning. lakshmanaraj's code is quite good, Alexander Korobka's is ~30% faster, Deadcode's is a tiny bit faster still (~10%), but I found the following tricks from the above link give a further 10% improvement.

// Integer log base 10, modified binary search.
int ilog10c(unsigned x) {
   if (x > 99)
      if (x < 1000000)
         if (x < 10000)
            return 3 + ((int)(x - 1000) >> 31);
         // return 3 - ((x - 1000) >> 31);              // Alternative.
         // return 2 + ((999 - x) >> 31);               // Alternative.
         // return 2 + ((x + 2147482648) >> 31);        // Alternative.
         else
            return 5 + ((int)(x - 100000) >> 31);
      else
         if (x < 100000000)
            return 7 + ((int)(x - 10000000) >> 31);
         else
            return 9 + ((int)((x-1000000000)&~x) >> 31);
         // return 8 + (((x + 1147483648) | x) >> 31);  // Alternative.
   else
      if (x > 9)
            return 1;
      else
            return ((int)(x - 1) >> 31);
         // return ((int)(x - 1) >> 31) | ((unsigned)(9 - x) >> 31);  // Alt.
         // return (x > 9) + (x > 0) - 1;                             // Alt.
}

Note this is log 10, not number of digits, so digits = ilog10c(x)+1.

Doesn't support negatives, but that's easily fixed with a -.

Greg A. Woods
  • 2,663
  • 29
  • 26
jozxyqk
  • 16,424
  • 12
  • 91
  • 180
4
if (x == MININT) return 10;  //  abs(MININT) is not defined
x = abs (x);
if (x<10) return 1;
if (x<100) return 2;
if (x<1000) return 3;
if (x<10000) return 4;
if (x<100000) return 5;
if (x<1000000) return 6;
if (x<10000000) return 7;
if (x<100000000) return 8;
if (x<1000000000) return 9;
return 10; //max len for 32-bit integers

Very inelegant. But quicker than all the other solutions. Integer Division and FP logs are expensive to do. If performance isn't an issue, the log10 solution is my favorite.

Wayne Sheppard
  • 546
  • 3
  • 5
  • That actually turns out to be the fastest method even for the worst case (2^32-1) - see my update fo timings. – paxdiablo Jul 01 '09 at 13:33
  • 13
    I often suspect that "code smell" is a term trotted out by people who just don't like the code - it seems a very unscientific term. This code is perfectly readable (to me at least and to anyone else with half a brain if you add one simple comment line) and will outperform any other solution listed here (very important in the environment I was forged in). And the algorithm is scalable at O(log n) and portable if you just add more if statements to suit the environment you're working in. – paxdiablo Jul 01 '09 at 14:27
  • 3
    The question is tagged C and math. Any solution is welcome, even the fastest. – Ben Jul 01 '09 at 14:44
  • @Pax: Actually, making it a loop should not make it significantly slower (repeatedly multiply the threshold by 10), and will make it more compact. As a bonus, it would be perfectly portable to any possible sizeof(int) when you limit it by MAX_INT or such. – stormsoul Jul 01 '09 at 14:48
  • 2
    It's fast because it is just one compare per digit. The Iterative solutions are one compare, one division, and one increment per digit. Integer division is expensive, 17 cycles on a C2D. log10 is well over 100 cycles. – Wayne Sheppard Jul 01 '09 at 15:27
  • @Wayne Sheppard: You do not need to do divisions with an iterative solution - look at my answer. What's more, the multiplication (which the compiler would transform to 2 shifts and 1 add - 2 cycles total) can be done IN PARALLEL with the increment and compare - assuming the branch after the compare is correctly predicted. This gives 2 cycles / iteration. – stormsoul Jul 01 '09 at 16:44
  • @stormsoul, your loop never exits, right? Your test condition is: x<=INT_MAX/10*10 (or x <=2,147,483,640 ) After x = 1,000,000,000, x *= 10 overflows, wrapping back to negative. Comparing any integer against MAX_INT is kinda pointless. That said, your multiplication iteration works faster than using division. You just need to figure out how to terminate the loop correctly. – Wayne Sheppard Jul 01 '09 at 20:28
  • @Wayne Sheppard: Foo. You're right. Stupid me. I forgot I'm multiplying by 10, and somehow was thinking that the additional bit of precision given by unsigned would be enough - daydreaming, or what? :). Corrected. Thx! BTW, the loop exits, though with a wrong result :). After overflow x would go up until it ended within the (INT_MAX/10*10, UINT_MAX] interval. It would eventually get there, after at most few overflows :). – stormsoul Jul 02 '09 at 00:24
2
    int n = 437788;
    int N = 1; 
    while (n /= 10) N++; 
xcramps
  • 1,203
  • 1
  • 9
  • 9
2

Just a little adjust for C language:

floor( log10( abs( (number)?number:1 ) ) + 1 );
1

Since no-one mentioned, the less than 10^ can be done with SIMD. Here is an implementation with an eve library for sse2, avx2 and arm-v8.

https://godbolt.org/z/bscr3MWr4

I don't know how fast this is, though AVX-2 looks quite nice

count_digits(int):                      # @count_digits(int)
        vmovd   xmm0, edi
        vpbroadcastd    ymm0, xmm0
        vmovdqa ymm1, ymmword ptr [rip + .LCPI0_0] # ymm1 = [10,100,1000,10000,100000,1000000,10000000,100000000]
        vpcmpgtd        ymm0, ymm1, ymm0
        vmovmskps       ecx, ymm0
        bsf     edx, ecx
        add     edx, 1
        xor     esi, esi
        cmp     edi, 1000000000
        setl    sil
        mov     eax, 10
        sub     eax, esi
        test    cl, cl
        cmovne  eax, edx
        vzeroupper
        ret
Denis Yaroshevskiy
  • 1,218
  • 11
  • 24
0

DON'T use floor(log10(...)). These are floating-point functions, and slow ones, to add. I believe the fastest way would be this function:

int ilog10(int num)
{
   unsigned int num = abs(num);
   unsigned int x, i;
   for(x=10, i=1; ; x*=10, i++)
   {
      if(num < x)
         return i;
      if(x > INT_MAX/10)
         return i+1;
   }
}

Note that the binary search version some people suggested could be slower due to branch mispredictions.

EDIT:

I did some testing, and got some really interesting results. I timed my function together with all the functions tested by Pax, AND the binary search function given by lakshmanaraj. The testing is done by the following code snippet:

start = clock();
for(int i=0; i<10000; i++)
   for(int j=0; j<10000; j++)
      tested_func(numbers[j]);
end = clock();
tested_func_times[pass] = end-start;

Where the numbers[] array contains randomly generated numbers over the entire range of the int type (barring MIN_INT). The testing was repeated for each tested function on THE SAME numbers[] array. The entire test was made 10 times, with results averaged over all passes. The code was compiled with GCC 4.3.2 with -O3 optimization level.

Here are the results:

floating-point log10:     10340ms
recursive divide:         3391ms
iterative divide:         2289ms
iterative multiplication: 1071ms
unrolled tests:           859ms
binary search:            539ms

I must say I got really astonished. The binary search performed far better than I thought it would. I checked out how GCC compiled this code to asm. O_O. Now THIS is impressive. It got optimized much better than I thought possible, avoiding most branches in really clever ways. No wonder it is FAST.

stormsoul
  • 476
  • 2
  • 5
  • Well, the fastest way turns out to be unrolling that loop into hand-optimized if statements. But you're dead right about the slowness of floating point. – paxdiablo Jul 01 '09 at 14:34
  • @Pax: Floating point being slower than integer is one thing, and log10() and floor() being VERY slow is another. I was referring to the latter. – stormsoul Jul 01 '09 at 15:12
  • I'm going to vote this one up for the clever use of multiplication on the threshold rather than division on the value. I suppose you *could* invent a CPU where division was faster but I don't think I've ever seen one, and I'd probably fire the engineer that did it :-) – paxdiablo Jul 01 '09 at 15:53
  • Leave it with me, @stormsoul, I'll get back to you in about 8 hours (it's midnight here in Oz). – paxdiablo Jul 01 '09 at 15:54
  • @Pax: You couldn't. The division operation is inherently much more complex (and much more *sequential*) than multiplication - making any implementation much slower than what is possible with mults. Incidentally, an optimizing compiler would not emit any divisions for the "division" code! It would transform it into a multiplication by reciprocal. Because the divisor is a small constant, the reciprocal can be computed at compile-time. It would not emit any multiplications for the "multiplication" code, either. This would be transformed into two shifts and one add - for a total of 2 clock cycles. – stormsoul Jul 01 '09 at 16:36
  • This multiply-iterative solution bested the divide-iterative one by a factor of two - that's pretty impressive. – paxdiablo Jul 02 '09 at 00:29
0

I guess, the simplest way would be:

 int digits = 0;
if (number < 0) digits = 1;
while (number) {
    number /= 10;
    digits++;
}

digits gives the answer.

Zeeshan
  • 2,884
  • 3
  • 28
  • 47
  • 2
    This method will give incorrect results (off by one) for negative integers and this case of `0` it will count zero digits. – j b Oct 02 '14 at 10:18
0

A simple way to find the length (i.e number of digits) of signed integer is this:

while ( abs(n) > 9 )
{
    num /= 10;
    ++len;
}

Where n is the integer you want to find the length of and where len is equal to the number of digits in the integer. This works for both values of n (negative or positive).

The call on abs() is optional, if you are only working with positive integers.

-1

you can find number of digits in a number by using this formaula ceil (log10 (abs (x))) where ceil returns a integer number just greater than number

Anshul garg
  • 233
  • 2
  • 6
  • This fails at x=100 where it gives 2 digits instead of 3. `ceil(log10(x))` shouldn't be used, but rather `floor(log10(x)) + 1`. – Anic17 Jan 06 '23 at 19:10
-1
void main()
{     
    int a,i;
    printf("Enter the number :");       
    scanf("%d",&a);

    while(a>0)     
    {
        a=a/10;   
        i++;  
    }

    getch();
}
Ori Dar
  • 18,687
  • 5
  • 58
  • 72
Akshay
  • 1