1

I try to write a function to count number digits,and by the way,I try to compare the efficiency of the different way. 1.the lenstr(i) way:

def nDigits(i):
    return len(str(i))
for i in range(100000):
    print nDigits(i)

it takes about 143.75s

2.the log10 way:

import math
def nDigits(i):
    if i > 0:
        n = int(math.log10(i)) + 1
    elif i == 0:
        n = 1
    else:
        n = int(math.log10(-i)) + 2
    return n

for i in range(100000):
    print nDigits(i)

it takes about 144.35s

3.the division way:

def nDigits(i):
    t = 0
    while i > 0:
        t += 1
        i /= 10
    return t
for i in range(100000):
    print nDigits(i)

it takes about 143.43s

4.the division way in c:

#include<stdio.h>

int digits(int num){
    int i = 0;
    while (num > 0){
        i += 1;
        num /= 10;
    }
    return i;
} 

void main(){
    int i = 0;
    while (i < 100000){
        i += 1;
        printf("%d",digits(i));    
    }
}

it takes about 0.07s

Is the C is 2000 times better than python...or There is a better way for python to counting number digits. thx guys,plz help me.

Dan D.
  • 73,243
  • 15
  • 104
  • 123
Tom Heng
  • 35
  • 5
  • 1
    `elif i = 0` will produce a syntax error. – Blender Mar 11 '13 at 02:36
  • 1
    What version of Python are you using? How are you running it? Note that the printing is actually the thing that takes the most time here, not the calculation, at least for me. Have you tried `for i in xrange(100000): nDigits(i)` ? – cge Mar 11 '13 at 02:53
  • You neglected this approach: http://stackoverflow.com/a/3069580/1553090 - you can further improve that by arranging the `if` statements in a binary format. That way you have a maximum of about 3 comparisons (oh, the answer actually includes that at the end). – paddy Mar 11 '13 at 04:14

3 Answers3

1

That slow?? If you change for i in range(100000): to for i in xrange(100000):, it is much faster, at least on my computer ( 1 sec or 2 or 3).

I suspect the slowness is caused by your usage of range(100000)

xrange is more efficient because instead of generating a list of objects, it just generates one object at a time. You should favor it over range in this case.

EDIT: after @cge mentioned the problem, i tested your original code (using range(100000)) and it was also finished pretty fast, within a sec or two, so this may not be the cause of your problem (something fishy happened, which I couldn't see from the code you posted here), but I suggest you use xrange anyway.

zzk
  • 1,347
  • 9
  • 15
  • but for me ,I change the range() to xrange(),it's slower,takes 150.28s.my python is python 2.7.2. – Tom Heng Mar 11 '13 at 02:49
  • I'm not sure this is user643937's problem, actually. Running the division code on my computer (Python 2.7.3), *regardless* of whether I use range or xrange, never takes more than a few seconds. The C code takes around 0.025 seconds. xrange and range don't seem to give me significantly different times: I get around 1.08 seconds either way. I think there's something seriously wrong here with the Python installation or environment that user643937 is using. – cge Mar 11 '13 at 02:51
  • @cge could be.. I didn't test the original code, but just discovered using `xrange(10000)` resulted in much faster execution. In any case, he should have used `xrange` anyway.. – zzk Mar 11 '13 at 03:04
1

Simplify your test cases and remove all of those prints:

import math

def num_digits1(n):
    return len(str(n))

def num_digits2(n):
    return int(math.log10(n)) + 1

def num_digits3(n):
    t = 0

    while n:
        t += 1
        n /= 10

    return t

Here are my results:

>>> %timeit num_digits1(random.randint(1, 100000000))
100000 loops, best of 3: 1.64 us per loop
>>> %timeit num_digits2(random.randint(1, 100000000))
100000 loops, best of 3: 1.87 us per loop
>>> %timeit num_digits3(random.randint(1, 100000000))
100000 loops, best of 3: 2.49 us per loop
>>> %timeit random.randint(1, 100000000)
1000000 loops, best of 3: 1.29 us per loop

Subtracting the time it takes to generate a random number, I get:

num_digits1  0.35 us
num_digits2  0.58 us
num_digits3  1.20 us

And my C code comparison (which I hope is fair):

#include <stdlib.h>

int rand_int(int min, int max) {
    return min + (rand() / (double) RAND_MAX) / (max - min);
}

int num_digits(int num) {
    int i = 0;

    while (num > 0){
        i += 1;
        num /= 10;
    }

    return i;
} 

int main() {
    int i;

    for (i = 0; i < 10000000; i++) {
        num_digits(rand_int(1, 100000000));
    }

    return 0;
}

I run it:

$ gcc test.c -o test
$ time ./test./test
0.15s user 0.00s system 97% cpu 0.154 total

And my time is:

  0.154 s / 10,000,000
= 0.0154 us (0.0138 us with -O3)

The C code is about 23 times faster than the Python solution, which seems normal. Hopefully my C random number generator works.

With PyPy, I get 66.7 ns (not us) for num_digits1, which is only 4.3 times slower.

Blender
  • 289,723
  • 53
  • 439
  • 496
1

I think your bottleneck is the print statement. Try saving the results in a list instead.

def nDigits(i):
    return len(str(i))
results = []
for i in xrange(1000000):
    results.append(nDigits(i))
print len(results)

I used xrange instead of range and added an extra 0. It executes in 0.45 seconds on my machine.

Using a list comprehension gets the time down to 0.37 seconds.

def nDigits(i):
    return len(str(i))
results = [nDigits(i) for i in xrange(1000000)]
print len(results)

Removing the function call overhead gets the time down to 0.31 seconds.

results = [len(str(i)) for i in xrange(1000000)]
print len(results)
casevh
  • 11,093
  • 1
  • 24
  • 35
  • yes, I simply removed print in python and printf in c, reduce the time of python code to about 0.45s,and for the c, the result is 0.06s. c is much more efficiency in printf function. – Tom Heng Mar 11 '13 at 03:53