1
  • Is there a methodology used in Python programming to decide safe test values? Something to make sure that accidental large values do not lead to risks.
  • I have read that Python discourages type checking. Should type-checking or bound-checking be done in such cases or are there alternatives?


I was working with this code and testing the running times. I accidentally entered a really large number and ran the code. I was able to stop it via task manager when it had reached 850MB RAM usage and going up. I don't want something like that to happen again.

def primes_list(num):
    ans = [2]
    for i in range(3, num, 2):

        temp = False
        for j in ans:
            if i % j == 0 or j*j > i:
                temp = True
                break

        if temp == False:
            ans.append(i)
    else:
        return ans
Wolph
  • 78,177
  • 11
  • 137
  • 148
Aseem Bansal
  • 6,722
  • 13
  • 46
  • 84
  • What do you mean by "lead to risk"? Does it mean too much memory consuming or others? – Roger May 24 '13 at 11:25
  • If you think your code has a limit, check for this limit. It is not about type checking, in any language you will have the same issue. – iurisilvio May 24 '13 at 11:27
  • I validate/typecheck/cast all the time in Python when I get values from external sources. – Fred Foo May 24 '13 at 11:27
  • @Roger For starters too much memory usage. That led to my laptop freezing before I was able to open task manager. – Aseem Bansal May 24 '13 at 11:28
  • @iurisilvio So should I do type checking in such cases? – Aseem Bansal May 24 '13 at 11:31
  • @Zel it is not type checking. The algorithm just do not work for large values. If you can't control which parameters were passed, make it fail before it is late. – iurisilvio May 24 '13 at 11:38

4 Answers4

1

If num is a really large number, you'd better use xrange instead of range. So change this line

for i in range(3, num, 2):

to

for i in xrange(3, num, 2):

This will save a lot of memory for you, for that range would pre-allocate the list in memory. When num is quite large, the list will occupy lots of memory.

And if you want to limit the memory usage, just check the num before doing any operation.

Roger
  • 440
  • 2
  • 13
  • This is useful. I am a Python beginner so didn't knew that. I hope you wouldn't mind if I start asking about other risks. What about user entering some other type like float, string etc.? Any suggestions about that? – Aseem Bansal May 24 '13 at 11:50
  • @Zel use function isinstance(num, int) to check whether it's integer – Roger May 24 '13 at 12:10
1

On linux bash there is a feature ulimit which enables the memory for any processes running to be restricted. Add this to your program, and put print num as the first line of your primes_list function

try:
    primes_list(10)
    primes_list(100)
    primes_list(1000)
    primes_list(10000)
    primes_list(100000)
    primes_list(1000000)
    primes_list(10000000)
    primes_list(100000000)
except MemoryError:
    print "out of memory"

Then you should be able to see the below feature working in the shell. Note that I start a new bash shell. Ending the shell will bring the ulimit set back to normal. It is possible to script this of course. In this example I have ulimited the virtual memory size to 100MB

$ bash
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63064
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63064
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
$ ulimit -v 100000
$ python foo.py 
10
100
1000
10000
100000
1000000
10000000
out of memory
Vorsprung
  • 32,923
  • 5
  • 39
  • 63
1

Your precise problem was that you typed too large a value while testing your function at the command line. The solution here isn't to modify the function in any way, but to use automated testing.

At its simplest, automated testing just means writing another function that calls your function and makes sure it returns the right value. The computer does exactly what you've been doing at the command line. However, the automated approach is better because your test function is saved in a file - you don't need to type your test values at the command prompt every time. So you're basically immune to typing the wrong number and getting a memory overflow. There are lots of other advantages, too.

Python's standard library includes the unittest module, which is designed to help you organise and run your unit tests. More examples for unittest here. Alternatives include Nose and py.test, both of which are cross-compatible with unittest.


Example for your primes_list function:

import unittest

class TestPrimes(unittest.TestCase):
    def test_primes_list(self):
        pl = primes_list(11)  # call the function being tested
        wanted = [2,3,5,7,11]  # the result we expect
        self.AssertEqual(pl, wanted)

if __name__ == "__main__":
    unittest.main()

To prove that automated testing works, I've written a test that will fail because of a bug in your function. (hint: when the supplied maximum is a prime, it won't be included in the output)

Community
  • 1
  • 1
Benjamin Hodgson
  • 42,952
  • 15
  • 108
  • 157
0

Aren't you trying to list prime numbers? Don't know if this will help you, but still:

def primes(n): // n is the maximum number we want to check
    if n==2: return [2] // 2 is the lowest prime number
    elif n<2: return [] // there's no prime number below 2
    s=range(3,n+1,2) // range from 2 by 2 numbers, leave out even numbers
    mroot = n ** 0.5 // get's the root of maximum number
    half=(n+1)/2-1
    i=0
    m=3
    while m <= mroot: // no point in checking above the root of the highest number
            if s[i]:
                    j=(m*m-3)/2
                    s[j]=0
                    while j<half:
                            s[j]=0
                            j+=m
            i=i+1
            m=2*i+3
    return [2]+[x for x in s if x] // final array consists of 2 and the rest of primes

For more information about prime determination algorithms check out:

http://en.wikipedia.org/wiki/Primality_test

Which is the fastest algorithm to find prime numbers?

Community
  • 1
  • 1
Dropout
  • 13,653
  • 10
  • 56
  • 109
  • I ran it. This is definitely better. Can you explain what are you doing in this? – Aseem Bansal May 24 '13 at 12:23
  • I use a different algorithm for checking if the number is a prime. I narrow down the number of checked numbers. I'll comment in the answer. – Dropout May 24 '13 at 12:50