0

I am solving a coding question from leetcode in which we are supposed to Count the number of prime numbers less than a non-negative number, n.

This is my solution:

class Solution(object):

    def isPrime(self, Number):
        return 2 in [Number, 2**Number % Number]

    def countPrimes(self, n):
        """
        :type n: int
        :rtype: int
        """
        count_prime = 0
        prime_range = filter(lambda x: x % 2 != 0 and x % 3 != 0 and x % 5 != 0, xrange(n))

        for i in prime_range:
            if self.isPrime(i):
                count_prime += 1

        return count_prime

print Solution().countPrimes(499979)

While this works for smaller numbers, it is taking a lot of time for larger numbers and I am not sure where which part is consuming more time in my program? Here I am testing for numbers greater than 100 only.

Could anyone help me to find which part is taking more time

Bhargav Rao
  • 50,140
  • 28
  • 121
  • 140
python
  • 4,403
  • 13
  • 56
  • 103
  • Btw, 341 is the smallest Fermat pseudo-prime. It's the least positive number which is composite but for which your `isPrime` function returns `True`. – Steve Jessop Jan 23 '16 at 18:10
  • 5
    You might want to look into the 3-argument form of Python's [pow function](https://docs.python.org/3.5/library/functions.html#pow). And read up on pseudoprimes, strong pseudoprimes, Carmichael numbers and the Sieve of Eratosthenes. – Mark Dickinson Jan 23 '16 at 18:19
  • 1
    1387, 1729, 2047, 2701, 2821, 3277, 4033, 4369, 4681, 5461, 6601, 7957, 8321, 8911 are more false positives for your isPrime – M4rtini Jan 23 '16 at 18:20
  • @M4rtini: you missed a few! See https://oeis.org/A001567 – Steve Jessop Jan 23 '16 at 18:24
  • @SteveJessop I used the `prime_range` of OP, for n = 10 000, so i would guess the ones i missed was already filtered out by that? – M4rtini Jan 23 '16 at 18:27
  • @M4rtini: yes, sounds right. 561 divides by 3, 645 by 5, 1105 by 5, 1905 by 5, 4371 by 3 to name the first few you didn't include. The filter also excludes 2, so in this context checking 2 against `Number` isn't necessary (but the primes 2, 3, 5 won't be counted by this code). – Steve Jessop Jan 23 '16 at 18:28
  • Thank you everyone for help! I have updated my solution, and it is working well :) – python Jan 23 '16 at 19:18
  • 1
    Note that in the Sieve of Eratosthenes implementation you don't need `isPrime` at all: replace `if self.isPrime(i):` with `if total_prime[i]:` (and remove the `total_prime[i] = True` line: it's redundant). – Mark Dickinson Jan 23 '16 at 20:20
  • 1
    Please do not add your answer to the post. Consider creating a new answer and explain the steps that you performed in that answer. I have rolled back the edit in this case. Take care the next time. Regards – Bhargav Rao Nov 10 '16 at 19:27

2 Answers2

2

Could anyone help me to find which part is taking more time

Why yes, Python can. In short Python comes bundled with cProfile a profiler for Python code. Running your module under it produces the following statistics:

jim@jim: python -m cProfile tt.py 
673
         6311 function calls in 0.016 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.016    0.016 tt.py:1(<module>)
        1    0.000    0.000    0.000    0.000 tt.py:1(Solution)
     4979    0.002    0.000    0.002    0.000 tt.py:12(<lambda>)
     1327    0.013    0.000    0.013    0.000 tt.py:3(isPrime)
        1    0.000    0.000    0.016    0.016 tt.py:6(countPrimes)
        1    0.001    0.001    0.003    0.003 {filter}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Where all these columns are described in the documentation. So, as you can see, most time is actually spent in function isPrime:

1327    0.013    0.000    0.013    0.000 tt.py:3(isPrime)

A total of 1327 calls took 0.013/0.016 total time. So yup, that's where the issue is.

Now, after you discover where the bottleneck is you can try to find ways to optimize. Most of the times you can achieve better performances by simply using more efficient algorithms along with suitable data structures.

If you want to fast faster fastest, consider looking at things like Cython or Numba. If those don't cut it write a pure C extension which will probably run a bit faster. If that doesn't cut it just leave give up because you're not happy with anything.

Dimitris Fasarakis Hilliard
  • 150,925
  • 31
  • 268
  • 253
  • Thanks for this! Shouldn't this `2 in [Number, 2**Number % Number]` should take constant time? Any idea why this function taking more time. – python Jan 23 '16 at 18:11
  • 1
    @python: The value `2**Number` has a number of digits proportional to `Number` (in fact, equal to `Number+1` in base 2) and therefore soon doesn't fit in any underlying fixed-width type used by Python. The upshot is that the `%` operation will take time roughly proportional to `Number`. – Steve Jessop Jan 23 '16 at 18:14
  • Indeed, the computation inside the list literal is messing with you in this case. This should make you question this particular way of tackling this issue and instead try and find a more efficient algorithm. – Dimitris Fasarakis Hilliard Jan 23 '16 at 18:22
0

Have you taken a look at eratosthenes sieve? Its a classic algorithm for finding prime numbers:

The Sieve of Eratosthenes is a simple algorithm that finds the prime numbers up to a given integer. Implement this algorithm, with the only allowed optimization that the outer loop can stop at the square root of the limit, and the inner loop may start at the square of the prime just found. That means especially that you shouldn't optimize by using pre-computed wheels, i.e. don't assume you need only to cross out odd numbers (wheel based on 2), numbers equal to 1 or 5 modulo 6 (wheel based on 2 and 3), or similar wheels based on low primes. http://rosettacode.org/wiki/Sieve_of_Eratosthenes

Sieve of Eratosthenes - Finding Primes Python

Community
  • 1
  • 1
bcollins
  • 3,379
  • 4
  • 19
  • 35
  • Thanks for this! But what is wrong with my program? It should run fast enough to count all the prime numbers less than `n` even for larger numbers. – python Jan 23 '16 at 18:09