10

I'm trying to do a simple primality test in Python.

Accoding to Wikipedia, a primality test is the following:

Given an input number n, check whether any integer m from 2 to n − 1 divides n. If n is divisible by any m then n is composite, otherwise it is prime.

I started with ruling out the even numbers - with the exception of 2 - as candidates to prime

def prime_candidates(x):
    odd = range(1, x, 2)
    odd.insert(0, 2)
    odd.remove(1)
    return odd

Then writing a function to check for primes, according to the rules above.

def isprime(x):
    for i in range(2, x-1):
            if x % i == 0:
                    return False
            else:
                    return True

And this is the main function, which iterates over a list of 8000 prime candidates and tests their primality

def main():
    end = 8000
    candidates = prime_candidates(end)
    for i in candidates:
            if isprime(i) and i < end:
                    print 'prime found ' + str(i)

The problem is that the isprime function returns True for numbers that aren't primes.

johnsyweb
  • 136,902
  • 23
  • 188
  • 247
Mahmoud Hanafy
  • 7,958
  • 12
  • 47
  • 62
  • `candidates` could just be `[2] + range(3, end, 2)`. I don't know why you include zero. And you don't need to test `and i < end`; that's guaranteed by `range(..., end, ...)`. – Marcelo Cantos Nov 05 '11 at 09:44
  • 2
    Note that you don't have to check up to `n-1`. It's enough to check up to `sqrt(n)` as anything higher than that will surely not divide it or has already been checked by a number lower than the square root. – Paul Manta Nov 05 '11 at 09:45
  • 1
    Also note that all canidate generation approaches mentioned make an unnecessary copy. `xs = range(1, x, 2); xs[0] = 2` makes not copies, and with `xrange` (plain `range` in 3.x) and `itertools.chain` one can even avoid storing more than one canidate at a time. –  Nov 05 '11 at 09:48
  • @MarceloCantos candidates doesn't include zero, it starts from 1, I'm just inserting 2 at the beginning. – Mahmoud Hanafy Nov 05 '11 at 09:52
  • 1
    @PaulManta `range` doesn't accept float, which `sqrt(n)` returns, and casting to int messes up the result due to rounding error. – Mahmoud Hanafy Nov 05 '11 at 09:53
  • @MahmoudHossam: Ah, silly me. I just read (0, 2) incorrectly. – Marcelo Cantos Nov 05 '11 at 09:54
  • @MahmoudHossam: RE: sqrt. Casting to int is not a problem, since the next integer above the square root is guaranteed not to divide x. Doing so also takes your cost from O(N) to O(sqrt(N)), which is well worth it. – Marcelo Cantos Nov 05 '11 at 09:55
  • @MarceloCantos I tried doing it, but it yields a different result, so I think it does make a difference. – Mahmoud Hanafy Nov 05 '11 at 10:15
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/4745/discussion-between-mahmoud-hossam-and-marcelo-cantos) – Mahmoud Hanafy Nov 05 '11 at 10:18
  • 1
    Related: [Does a library for prime-related functions exist for Python?](https://stackoverflow.com/q/10974805/427158) – maxschlepzig Sep 15 '18 at 13:16

4 Answers4

14

Have a look at the Miller–Rabin primality test if a probabilistic algorithm will suffice. You could also prove a number to be prime, with for instance Elliptic Curve Primality Proving (ECPP), but it takes more effort.

A simple trial division algorithm is the following

def prime(a):
     return not (a < 2 or any(a % x == 0 for x in range(2, int(a ** 0.5) + 1)))

Edit: Here's a more educational version because the first solution is very condensed and perhaps harder to read:

from math import sqrt
def prime(a):
    if a < 2: return False
    for x in range(2, int(sqrt(a)) + 1):
        if a % x == 0:
            return False
    return True

I've substituted in sqrt(a) in place of a ** 0.5 to make things more clear. The square root is used to not look at more factors than we have to.

Morten Kristensen
  • 7,412
  • 4
  • 32
  • 52
  • 2
    The problem says I have to use trial division. – Mahmoud Hanafy Nov 05 '11 at 09:42
  • 2
    This is homework. The exercise is a means to teach programming! – David Heffernan Nov 05 '11 at 09:43
  • @David: I don't understand your point. Clearly Mahmoud gave an effort... he supplied his entire code. Asking this community is like asking the TA for help, right? I completely agree we should not simply provide answers, but providing help in debugging homework seems wonderfully community-minded. – Mike Williamson Dec 06 '12 at 01:02
  • @Mike The comment is directed at this answer. Which proposes some elaborate algo that is very off topic here. – David Heffernan Dec 06 '12 at 07:25
  • 1
    @David: Ah, gotcha, my bad. Regardless, while the citations might be a bit much, I still like Morten's algo. It is succinct, which is hard to learn at first, but a great skill to develop. He also returns the evaluation instead of a separate T/F statement, also great to get the head around. And he hints that a "trial division" is not really that efficient. – Mike Williamson Dec 13 '12 at 17:52
  • 2
    @MikeWilliamson The succinctness here is very bad. Much better is to use a few more lines and add some explanatory variables. Excessive succinctness is a terrible failing of many programmers. If this was spread out properly on 4 or 5 lines it would be much quicker to read. Much easier to read. Much easier to check. For a beginner, like the person asking the question, the code in this answer is black magic. – David Heffernan Dec 13 '12 at 17:57
  • I like the idea of use ```any``` statement instad of ```all```. It changes a full list comprehension for a search. But I'm not sure if once found the first divisor x the operation would break and return ```False```, since `range` returns all the values at once. Is this already taken care of, or should `xrange` instead? – Sik May 13 '15 at 00:29
  • @Sik: In python 2.x I would use `xrange` because it acts lazily as a generator and doesn't allocate the full list as `range` does. In python 3.x there is no `xrange`. However, in this case the result is the same because the range is run through. So using `any` instead of `all` simply makes it stop comparing values when a first `True` is found and therefore `all` would make the algorithm fail. – Morten Kristensen May 13 '15 at 12:04
  • @MortenKristensen: My consideration was not towards using `all` or `any` to achieve the correct answer. Both would do, if the condition is suited. My concerns were if using `any`+`xrange` in python 2.x would stop after the first true is found. Which I guess it is, and `any`+`range` in python 3.x should also do (since P3.x `range` is P2.x `xrange`). If the return had to parse the whole `-range` output no matter what due to how the expression has been called, it might be a better option to create a generative function and call it here. – Sik May 13 '15 at 13:20
  • I am not sure if someone has already told you, but your `prime` algorithm is far from being the optimal or optimised one, i.e. it is quite bad in terms of performance. For example, you are checking all even numbers, but primes, except 2, are a subset of odd numbers. There's also no need to write an algorithm in one line just to seem to be a geek, it's not helpful, and, in my opinion, is one of the worst things that Python provides. I am pretty sure that in terms of performance these syntax-sugar constructs provided by Python are not useful. – nbro Jan 19 '16 at 13:47
  • Trying not to divide so many times, you can avoid n**0.5 / 2 divisions: – juanfal Jan 06 '23 at 19:03
12

In brief, your isprime(x) checks whether the number is odd, exiting right after if x % 2 == 0.

Try a small change so that you would actually iterate:

def isprime(x):
    for i in range(2, x-1):
        if x % i == 0:
            return False
    else:
        return True

Note that else: is now part of the for loop rather than if statement.

alf
  • 8,377
  • 24
  • 45
  • While this solution is completely fine, the solution below with the "while" loop over d * d <= n is going to be markedly faster. This is not the optimal solution, IMHO. – Mike Williamson Dec 06 '12 at 01:00
  • Given that it's a homework, I did not assume that the performance mattered. The question itself was "what's wrong," there were no question on how would one make it better. – alf Dec 06 '12 at 16:58
  • apologies... didn't mean to criticize your solution. You're correct. I was just wondering why the community was voting this solution over the one that only goes up to sqrt(n). – Mike Williamson Dec 13 '12 at 17:55
  • 3
    Any reason to put the second return in an `else` block? The `for/else` construct is rather uncommon, and unneeded here given that the loop does not contain a `break` statement. – Clément Nov 09 '13 at 18:42
  • That's 2 years ago. I guess the reason was to keep the change to the minimum. – alf Nov 10 '13 at 19:14
2

Your function actually returns whether your number is odd.

Indeed, what you're doing is you're checking whether 2 divides your number, and return immediately. You never check for the other numbers.

What you need to do is take this return true out of the if's else clause and the for loop back into the main function body.

On a sidenote, If you're looking for the primes lower than a given number, you could store the primes you found in memory and then only try dividing you new number by those primes ! (because if d is composite and divides q, then p exists such that p is prime and p divides q).

Thomas Orozco
  • 53,284
  • 11
  • 113
  • 116
2

The problem is that you put return False in the else clause rather than at the end of the function. So your function will return right after the first divisor is checked, rather than going on to check other divisors.

Here is a simple primality test similar to yours:

def is_prime(n):
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return n > 1
Daniel Lubarov
  • 7,796
  • 1
  • 37
  • 56
  • (I didn't downvote, but...) Computing the square root once is bound to be cheaper than squaring n times. Also, while loops aren't very Pythonic; use range(). – Marcelo Cantos Nov 05 '11 at 09:50
  • I agree range is more pythonic, but OTOH integer multiplication is cheap, and the sqrt approach requires rounding up, which seems inelegant to me. – Daniel Lubarov Nov 05 '11 at 09:57
  • The next integer above the square root doesn't divide n. It is therefore safe to round down, which is as simple as casting to int. Besides, I'll favour O(sqrt(N)) over O(N) any day of the week, elegance be damned (within reason, of course). – Marcelo Cantos Nov 05 '11 at 10:01
  • You're right, a cast to int works fine. But this code is also O(sqrt(n)), and probably at least as fast since it doesn't have the overhead of list lookups (for python 2) or iterators (for 3). – Daniel Lubarov Nov 05 '11 at 10:06
  • Duuuhh. My brain's half-on, half-off tonight. – Marcelo Cantos Nov 05 '11 at 10:19