How to retrieve the first decimal digit of number efficiently

Question

One obvious solution is:

int n = 2134;
while(n > 9)
    n /= 10;

which takes linear time. Could we do any faster?

Is this any faster than linear time:

char s[100];
sprintf(s, "%d", n);
n = s[0]-'0';

Which are the other ways (efficiency is primary concern)?
I've seen this, except that I need to find only the first digit. (Also, I don't understand the answer).

Your second example probably should use `sprintf` rather than `scanf`? — Mats Petersson, Jun 30 '13 at 18:56
Formatted I/O is surely much (magnitudes) slower than division. But have you benchmarked it? Also, shouldn't that `sscanf()` really be `snprintf()` instead? — , Jun 30 '13 at 18:56
"I don't understand the answer" is not a good argument for posting a new question, unless that answer is actually bad. Which it isn't, it's more or less `digits = floor(log10(n)); firstDigits = n/10^digits;` and the requisite casting of `double`s to integral types. — millimoose, Jun 30 '13 at 18:57
Also, that answer is only constant time if the algorithm for calculating the decimal logarithm is. — millimoose, Jun 30 '13 at 18:58
Can you even come up with a **definition** of "first digit" that isn't linear in the logarithm of the number?! — Kerrek SB, Jun 30 '13 at 19:00
@H2CO3 This should be faster than normal I/O I suppose. Since it outputs to memory rather than stdout. — mohit, Jun 30 '13 at 19:02
@mohit You do realise that `sprintf` has to basically determine what *every* digit of the number is somehow. With the additional overhead of creating a string it can't possibly not be slower than determining only one. — millimoose, Jun 30 '13 at 19:03
@mohit And? It's not the file I/O that's slow (well, it's slow too) but the recognition of formatting patterns/specifiers, error checking, etc. in the routines. — , Jun 30 '13 at 19:04
You should remember that an algorithm's big-O running time is not the only determinant of how long an implementation takes to execute in the real world. — bames53, Jun 30 '13 at 19:14
One interesting thing about algorithms and complexity is that you can state different asymptotic complexities for the same exact algorithm. You say that the division approach is *linear*, or it could be considered *constant* if you consider that the number of digits is bound (in a platform where `sizeof(int)==4` there are at most 10 decimal digits). Just saying that *linear* might sound worse than it actually is... — David Rodríguez - dribeas, Jun 30 '13 at 19:18
In general, pure number crunch is really really fast compare to access memory(cache). even a simple operation such as `s[0]` which assume it access L1 cache is about a 32 bit floating point multiplication which is about 1-2 cycles. Integer division is worse but compiler can probably optimize it. — yngccc, Jun 30 '13 at 19:26

anatolyg · Accepted Answer · 2013-07-01T07:51:56.593

23

Some processors have instructions that calculate "how big" a number is, very quickly (see http://en.wikipedia.org/wiki/Leading_zero_count). This can be used to quickly choose a power of 10, and divide by it, instead of dividing by 10 repeatedly.

Suppose you are given a function clz that calculates the number of leading zero bits in a number's binary representation (0...32). Then, you can use a lookup table that gives the proper power of 10 for each number of leading zeros.

uint32_t powers_of_10[33] = {
    1000000000, 1000000000,
    100000000, 100000000, 100000000,
    10000000, 10000000, 10000000,
    1000000, 1000000, 1000000, 1000000,
    100000, 100000, 100000,
    10000, 10000, 10000,
    1000, 1000, 1000, 1000,
    100, 100, 100,
    10, 10, 10,
    1, 1, 1, 1, 1
};

int CalcFirstDecimalDigit(uint32_t x)
{
    int leading_zeros = clz(x);
    x /= powers_of_10[leading_zeros];
    if (x >= 10)
        return 1;
    else
        return x;
}

edited Jul 01 '13 at 07:51

answered Jun 30 '13 at 19:46

anatolyg

26,506
9
60
134

1

This looks like the fastest solution, for processors that support such an operation. – TonyK Jun 30 '13 at 21:13
Very small suggestion: use `uint32_t powers_of_10[33]`. – chux - Reinstate Monica Jul 01 '13 at 01:17
3

With all due respect - lookup tables are not fast. Typically they require RAM access, and at best it's in cache - which is still much slower than work in the registers. – einpoklum Jul 28 '16 at 16:41

score 14 · Answer 2 · answered Jun 30 '13 at 19:47

14

e.g. for in 32 bit unsigned:

Step 1: determine (by binary search) in which of the following intervals the value is:

0 .. 9
10 .. 99
100 .. 999
1000 .. 9999
10000 .. 99999
100000 .. 999999
1000000 .. 9999999
10000000 .. 99999999
100000000 .. 999999999
1000000000 .. 4294967295

takes max 4 compares

Step 2:

Than calculate the leading digit by one division.

answered Jun 30 '13 at 19:47

MrSmith42

9,961
6
38
49

You can hard-code the binary-search of *Step 1* to avoid a loop and improve the speed. – MrSmith42 Sep 27 '13 at 09:18

score 6 · Answer 3 · edited Jun 30 '13 at 19:20

6

I'm pretty sure that sprintf (as I assume it is) will be SIGNIFICANTLY slower. You could do some optimization to reduce the number of divide operations (which is one of the slowest instructions on nearly all processors).

So one could do something like this:

 while(n > 10000)
   n /= 1000;

 while(n >= 9)
   n /= 10;

That is, of course, if speed is really important.

edited Jun 30 '13 at 19:20

Casey

41,449
7
95
125

answered Jun 30 '13 at 19:00

Mats Petersson

126,704
14
140
227

It seems like this is the sort of scenario that [`libdivide`](http://libdivide.com/) was meant for. – millimoose Jun 30 '13 at 19:06
5

If speed is _that_ important, it might be faster to do a binary search on the powers of ten { 1, 10, 100,...} etc. After that you only need one division. You could hard-code your binary search as a series of tests and jumps, preferably in assembly language. – TonyK Jun 30 '13 at 19:07
1

Well, for a 32-bit, you'll do a max of 3 iterations in the first loop. For 64-bit numbers is gets more meaningful to reduce the number of divisions. – Mats Petersson Jun 30 '13 at 19:14
while(n >= 9) , for n=9 the output is 0 as 9/10=0 , rather than 9. It should be while(n>=10) . – Arjun Sunil Kumar Dec 17 '16 at 07:32

score 5 · Answer 4 · answered Jun 30 '13 at 18:59

5

Your second example should use sprintf. Anyway, it cannot be faster since the entire number is printed, thus all digits are searched.

The linked question/answer uses a property of logarithm: for a number of x digits, it's base 10 logarithm is between x and x+1. But, due to floating point errors this method doesn't really work properly in some cases. Also, take into consideration the fact that doing floating point is slower than doing integer arithmetic.

Thus, the simplest solution is also the faster.

answered Jun 30 '13 at 18:59

Mihai Maruseac

20,967
7
57
109

Well, that logarithm solution cannot possibly be faster, simply because it uses the logarithm function. That is much more expensive than any floating point arithmetic on modern processors. So, yes, that small loop is definitely faster than `sprintf` or `log()`, if not even faster than MrSmith42's answer. – cmaster - reinstate monica Jun 30 '13 at 20:24

score 4 · Answer 5 · answered Sep 29 '14 at 04:16

Here's a kind of variation on a binary search. Like a binary search it is O(log n). Whether it's faster will depend on how fast you can do integer division.

if (n >= 100000000)
    n /= 100000000
if (n >= 10000)
    n /= 10000
if (n >= 100)
    n /= 100
if (n >= 10)
    n /= 10

The method expands easily for integers with a larger range.

Gianluca Ghettini · Answer 6 · 2013-06-30T19:48:25.250

3

you can do it in O(1) constant time but at expense of a very large memory usage. It's the same old time/memory trade off.

You can create a lookup table of 2^31 entries (signed int), 4 bits per entry (with 4 bits you can encode the first digit 1-9 of the number in decimal representation).

then you can use you int to access the lookup table and get the first digit in O(1). the lookup table will take 2^31 * 4 bits -> 1024 Mbytes

it's the fastest way I can think of...

edited Jun 30 '13 at 19:48

answered Jun 30 '13 at 19:41

Gianluca Ghettini

11,129
19
93
159

3

-1 for memory bloat, which negates any performance benefit due to memory latency. – cmaster - reinstate monica Jun 30 '13 at 20:26
4

@cmaster The "memory bloat" is obvious and is shown to how to get to O(1). This approach is quite practical in a subsets of cases. Say the int range was to be limited 0...1000. Such a table size would be practicable. Often when high performance routines are needed, there exists such limitations. I do not expect the OP to list all conditions - just the main ones. This approach, though not useful for a GP library, may be just the ticket for the OP or future reviewer. – chux - Reinstate Monica Jul 01 '13 at 01:31
1

Right, I just wanted to show that one can trade time with memory quite easily – Gianluca Ghettini Jul 01 '13 at 09:04
Plus, the naive approach using reiterated divisions is O(1) in space. @chux: what's OP? what's GP? – Gianluca Ghettini Jul 01 '13 at 09:21
@chux OP = [Original Poster](http://www.urbandictionary.com/define.php?term=op); GP = general-purpose (presumably) – anatolyg Jul 01 '13 at 09:26
4

I know about the possibilities to trade memory for performance, but this is not the place to do it. As other answers note, you can find the correct answer in four steps for the full integer range without touching L1 cache, which is certainly faster than waiting for the memory subsystem to deliver the answer. Going to smaller number ranges actually weakens your argument since it takes only two steps for the range up to 9999. This is not the sort of problem one should throw memory at. – cmaster - reinstate monica Jul 01 '13 at 16:14

score 3 · Answer 7 · answered Jul 01 '13 at 08:00

3

You Can Do simply This :

//Shashank Jain
#include<iostream>
#include<cmath>
using namespace std;
int main()
{
    int num,fdigit;
    cin>>num;
    if(num<0)
        num*=-1;
    int l=log10(num); // l = (length of number -1)

    fdigit=num/pow(10,l);

    cout<<fdigit<<endl;
    return 0;
}

answered Jul 01 '13 at 08:00

S J

3,210
1
22
32

Thanks. But it doesn't really works sometimes. See above answer by Mihai Maruseac. – mohit Jul 01 '13 at 10:05

score 1 · Answer 8 · edited Feb 28 '16 at 18:46

1

int FirstDigit ( int Number ) {

    // to obtain the <number of digits -1> use this math formula:

    int DigitsInNumber = (int) lg(Number);           

    long TenExpon = pow(10, DigitsInNumber);

    return (Number / TenExpon);                //first digit
}

also: lg(n) = ln(n) / ln(10);

edited Feb 28 '16 at 18:46

greybeard

2,249
8
30
66

answered Jul 09 '15 at 18:15

Dumitriu Sebastian

162
1
8

1

Floating-point calculations are not precise; rounding errors may cause incorrect results for powers of 10 (e.g. `10000` produces an incorrect result on my machine). Fortunately, it's easy to fix. – anatolyg Feb 28 '16 at 13:54
1

(Renaming your variable due to _edit not accepted_ with just the formatting "improvement") – greybeard Feb 28 '16 at 17:45

score 0 · Answer 9 · answered Jun 30 '13 at 19:00

Your first solution (assuming that n is known to be >= 0) is nearly optimal, and I think it could only be substantially improved by using in-line assembly language. But that would only be worthwhile if you were processing millions of such numbers.

Your second solution is -- how can I put this nicely? -- more of a Java-ish approach: Performance? La-di-da, who cares...

score 0 · Answer 10 · answered Oct 26 '18 at 07:47

    for(int i=0; i<n; i++)
    {  
        e=5; //Specify the number of digits in the number OR Exponential value of 10
        while(e>=0)
        {   
            tenp=pow(10,e); //#include <math.h>
            if(arr[i]/tenp!=0)
            {
                q[i][e]=arr[i]/tenp%10;
            }
            e--;
        }
    }

However, this code's complexity shall be O(n^2), which in undesirable.

score 0 · Answer 11 · answered Jul 12 '19 at 10:33

0

Another solution: store all your values in BCD format, big endian. Access the first nibble to get the first digit

Es.

Decimal value: 265
BCD format: 0010-0110-0101

answered Jul 12 '19 at 10:33

Gianluca Ghettini

11,129
19
93
159

score -2 · Answer 12 · answered Jun 30 '13 at 19:49

-2

if you number is x9x8x7x6x5x4x3x2x1 then you just divide by 10^8 so you need: The best way to find how many digit number have. You can use Binary search/

answered Jun 30 '13 at 19:49

G.w. Weil

21
1
6

RedKid · Answer 13 · 2018-09-24T16:04:18.643

First make a double variable in which the number is saved. Then divide that number by 10 in a loop to make the number continuously lose a digit, until it is a 1-digit number. cast the double variable to int to round it down. The result will then be the previously first decimal digit.

The code will run in O(log(n)).

#include <iostream>

using namespace std;

int main() {
    double a;
    cin >> a;
    while (true) {
        a /= 10;
        if (a < 10) {
            a = int(a);
            cout << a;
            break;
        }
    }
    return 0; 
}

How to retrieve the first decimal digit of number efficiently

13 Answers13

Linked

Related