14

One obvious solution is:

int n = 2134;
while(n > 9)
    n /= 10;

which takes linear time. Could we do any faster?

Is this any faster than linear time:

char s[100];
sprintf(s, "%d", n);
n = s[0]-'0';

Which are the other ways (efficiency is primary concern)?
I've seen this, except that I need to find only the first digit. (Also, I don't understand the answer).

Community
  • 1
  • 1
mohit
  • 5,696
  • 3
  • 24
  • 37
  • 1
    Your second example probably should use `sprintf` rather than `scanf`? – Mats Petersson Jun 30 '13 at 18:56
  • 4
    Formatted I/O is surely much (magnitudes) slower than division. But have you benchmarked it? Also, shouldn't that `sscanf()` really be `snprintf()` instead? –  Jun 30 '13 at 18:56
  • 4
    "I don't understand the answer" is not a good argument for posting a new question, unless that answer is actually bad. Which it isn't, it's more or less `digits = floor(log10(n)); firstDigits = n/10^digits;` and the requisite casting of `double`s to integral types. – millimoose Jun 30 '13 at 18:57
  • Also, that answer is only constant time if the algorithm for calculating the decimal logarithm is. – millimoose Jun 30 '13 at 18:58
  • Sorry, for the mistake. Corrected. – mohit Jun 30 '13 at 18:58
  • 3
    Can you even come up with a **definition** of "first digit" that isn't linear in the logarithm of the number?! – Kerrek SB Jun 30 '13 at 19:00
  • @H2CO3 This should be faster than normal I/O I suppose. Since it outputs to memory rather than stdout. – mohit Jun 30 '13 at 19:02
  • @mohit You do realise that `sprintf` has to basically determine what *every* digit of the number is somehow. With the additional overhead of creating a string it can't possibly not be slower than determining only one. – millimoose Jun 30 '13 at 19:03
  • @mohit And? It's not the file I/O that's slow (well, it's slow too) but the recognition of formatting patterns/specifiers, error checking, etc. in the routines. –  Jun 30 '13 at 19:04
  • You should remember that an algorithm's big-O running time is not the only determinant of how long an implementation takes to execute in the real world. – bames53 Jun 30 '13 at 19:14
  • 5
    One interesting thing about algorithms and complexity is that you can state different asymptotic complexities for the same exact algorithm. You say that the division approach is *linear*, or it could be considered *constant* if you consider that the number of digits is bound (in a platform where `sizeof(int)==4` there are at most 10 decimal digits). Just saying that *linear* might sound worse than it actually is... – David Rodríguez - dribeas Jun 30 '13 at 19:18
  • 1
    In general, pure number crunch is really really fast compare to access memory(cache). even a simple operation such as `s[0]` which assume it access L1 cache is about a 32 bit floating point multiplication which is about 1-2 cycles. Integer division is worse but compiler can probably optimize it. – yngccc Jun 30 '13 at 19:26

13 Answers13

23

Some processors have instructions that calculate "how big" a number is, very quickly (see http://en.wikipedia.org/wiki/Leading_zero_count). This can be used to quickly choose a power of 10, and divide by it, instead of dividing by 10 repeatedly.

Suppose you are given a function clz that calculates the number of leading zero bits in a number's binary representation (0...32). Then, you can use a lookup table that gives the proper power of 10 for each number of leading zeros.

uint32_t powers_of_10[33] = {
    1000000000, 1000000000,
    100000000, 100000000, 100000000,
    10000000, 10000000, 10000000,
    1000000, 1000000, 1000000, 1000000,
    100000, 100000, 100000,
    10000, 10000, 10000,
    1000, 1000, 1000, 1000,
    100, 100, 100,
    10, 10, 10,
    1, 1, 1, 1, 1
};

int CalcFirstDecimalDigit(uint32_t x)
{
    int leading_zeros = clz(x);
    x /= powers_of_10[leading_zeros];
    if (x >= 10)
        return 1;
    else
        return x;
}
anatolyg
  • 26,506
  • 9
  • 60
  • 134
14

e.g. for in 32 bit unsigned:

Step 1: determine (by binary search) in which of the following intervals the value is:

0 .. 9
10 .. 99
100 .. 999
1000 .. 9999
10000 .. 99999
100000 .. 999999
1000000 .. 9999999
10000000 .. 99999999
100000000 .. 999999999
1000000000 .. 4294967295

takes max 4 compares

Step 2:

Than calculate the leading digit by one division.

MrSmith42
  • 9,961
  • 6
  • 38
  • 49
6

I'm pretty sure that sprintf (as I assume it is) will be SIGNIFICANTLY slower. You could do some optimization to reduce the number of divide operations (which is one of the slowest instructions on nearly all processors).

So one could do something like this:

 while(n > 10000)
   n /= 1000;

 while(n >= 9)
   n /= 10;

That is, of course, if speed is really important.

Casey
  • 41,449
  • 7
  • 95
  • 125
Mats Petersson
  • 126,704
  • 14
  • 140
  • 227
  • It seems like this is the sort of scenario that [`libdivide`](http://libdivide.com/) was meant for. – millimoose Jun 30 '13 at 19:06
  • 5
    If speed is _that_ important, it might be faster to do a binary search on the powers of ten { 1, 10, 100,...} etc. After that you only need one division. You could hard-code your binary search as a series of tests and jumps, preferably in assembly language. – TonyK Jun 30 '13 at 19:07
  • 1
    Well, for a 32-bit, you'll do a max of 3 iterations in the first loop. For 64-bit numbers is gets more meaningful to reduce the number of divisions. – Mats Petersson Jun 30 '13 at 19:14
  • while(n >= 9) , for n=9 the output is 0 as 9/10=0 , rather than 9. It should be while(n>=10) . – Arjun Sunil Kumar Dec 17 '16 at 07:32
5

Your second example should use sprintf. Anyway, it cannot be faster since the entire number is printed, thus all digits are searched.

The linked question/answer uses a property of logarithm: for a number of x digits, it's base 10 logarithm is between x and x+1. But, due to floating point errors this method doesn't really work properly in some cases. Also, take into consideration the fact that doing floating point is slower than doing integer arithmetic.

Thus, the simplest solution is also the faster.

Mihai Maruseac
  • 20,967
  • 7
  • 57
  • 109
  • Well, that logarithm solution cannot possibly be faster, simply because it uses the logarithm function. That is much more expensive than any floating point arithmetic on modern processors. So, yes, that small loop is definitely faster than `sprintf` or `log()`, if not even faster than MrSmith42's answer. – cmaster - reinstate monica Jun 30 '13 at 20:24
4

Here's a kind of variation on a binary search. Like a binary search it is O(log n). Whether it's faster will depend on how fast you can do integer division.

if (n >= 100000000)
    n /= 100000000
if (n >= 10000)
    n /= 10000
if (n >= 100)
    n /= 100
if (n >= 10)
    n /= 10

The method expands easily for integers with a larger range.

Mark Ransom
  • 299,747
  • 42
  • 398
  • 622
3

you can do it in O(1) constant time but at expense of a very large memory usage. It's the same old time/memory trade off.

You can create a lookup table of 2^31 entries (signed int), 4 bits per entry (with 4 bits you can encode the first digit 1-9 of the number in decimal representation).

then you can use you int to access the lookup table and get the first digit in O(1). the lookup table will take 2^31 * 4 bits -> 1024 Mbytes

it's the fastest way I can think of...

Gianluca Ghettini
  • 11,129
  • 19
  • 93
  • 159
  • 3
    -1 for memory bloat, which negates any performance benefit due to memory latency. – cmaster - reinstate monica Jun 30 '13 at 20:26
  • 4
    @cmaster The "memory bloat" is obvious and is shown to how to get to O(1). This approach is quite practical in a subsets of cases. Say the int range was to be limited 0...1000. Such a table size would be practicable. Often when high performance routines are needed, there exists such limitations. I do not expect the OP to list all conditions - just the main ones. This approach, though not useful for a GP library, may be just the ticket for the OP or future reviewer. – chux - Reinstate Monica Jul 01 '13 at 01:31
  • 1
    Right, I just wanted to show that one can trade time with memory quite easily – Gianluca Ghettini Jul 01 '13 at 09:04
  • Plus, the naive approach using reiterated divisions is O(1) in space. @chux: what's OP? what's GP? – Gianluca Ghettini Jul 01 '13 at 09:21
  • @chux OP = [Original Poster](http://www.urbandictionary.com/define.php?term=op); GP = general-purpose (presumably) – anatolyg Jul 01 '13 at 09:26
  • 4
    I know about the possibilities to trade memory for performance, but this is not the place to do it. As other answers note, you can find the correct answer in four steps for the full integer range without touching L1 cache, which is certainly faster than waiting for the memory subsystem to deliver the answer. Going to smaller number ranges actually weakens your argument since it takes only two steps for the range up to 9999. This is not the sort of problem one should throw memory at. – cmaster - reinstate monica Jul 01 '13 at 16:14
3

You Can Do simply This :

//Shashank Jain
#include<iostream>
#include<cmath>
using namespace std;
int main()
{
    int num,fdigit;
    cin>>num;
    if(num<0)
        num*=-1;
    int l=log10(num); // l = (length of number -1)

    fdigit=num/pow(10,l);

    cout<<fdigit<<endl;
    return 0;
}
S J
  • 3,210
  • 1
  • 22
  • 32
1
int FirstDigit ( int Number ) {

    // to obtain the <number of digits -1> use this math formula:

    int DigitsInNumber = (int) lg(Number);           

    long TenExpon = pow(10, DigitsInNumber);

    return (Number / TenExpon);                //first digit
}

also: lg(n) = ln(n) / ln(10);

greybeard
  • 2,249
  • 8
  • 30
  • 66
  • 1
    Floating-point calculations are not precise; rounding errors may cause incorrect results for powers of 10 (e.g. `10000` produces an incorrect result on my machine). Fortunately, it's easy to fix. – anatolyg Feb 28 '16 at 13:54
  • 1
    (Renaming your variable due to _edit not accepted_ with just the formatting "improvement") – greybeard Feb 28 '16 at 17:45
0

Your first solution (assuming that n is known to be >= 0) is nearly optimal, and I think it could only be substantially improved by using in-line assembly language. But that would only be worthwhile if you were processing millions of such numbers.

Your second solution is -- how can I put this nicely? -- more of a Java-ish approach: Performance? La-di-da, who cares...

TonyK
  • 16,761
  • 4
  • 37
  • 72
0

    for(int i=0; i<n; i++)
    {  
        e=5; //Specify the number of digits in the number OR Exponential value of 10
        while(e>=0)
        {   
            tenp=pow(10,e); //#include <math.h>
            if(arr[i]/tenp!=0)
            {
                q[i][e]=arr[i]/tenp%10;
            }
            e--;
        }
    }

However, this code's complexity shall be O(n^2), which in undesirable.

Afrah
  • 1
0

Another solution: store all your values in BCD format, big endian. Access the first nibble to get the first digit

Es.

Decimal value: 265
BCD format: 0010-0110-0101
Gianluca Ghettini
  • 11,129
  • 19
  • 93
  • 159
-2

if you number is x9x8x7x6x5x4x3x2x1 then you just divide by 10^8 so you need: The best way to find how many digit number have. You can use Binary search/

G.w. Weil
  • 21
  • 1
  • 6
-2

First make a double variable in which the number is saved. Then divide that number by 10 in a loop to make the number continuously lose a digit, until it is a 1-digit number. cast the double variable to int to round it down. The result will then be the previously first decimal digit.

The code will run in O(log(n)).

#include <iostream>

using namespace std;

int main() {
    double a;
    cin >> a;
    while (true) {
        a /= 10;
        if (a < 10) {
            a = int(a);
            cout << a;
            break;
        }
    }
    return 0; 
}
RedKid
  • 1
  • 1