0

So, I've just started following coursera.org's new algorithm course. Since the course is in JAVA, and I don't want to learn Java+algorithms at the same time, I'm "translating" the JAVA samples to python. However, I'm a bit stuck, since the algorithms which should be quicker are performing worse. The weird bit to me, is that with a big input when I run the tests, the slower algorithm is the fastest,... and I think this has to do with something weird that's happening with the array (ids) that I instantiate the different objects with:

import time
from utils.benchmark import *
from quickunion import *
from quickunion_weighted import *
from quickfind import *

# create only one array of id's so the comparison is fair
ids = random_tree(10)

my_trees = [QuickFindUF, QuickUnionUF, 
        QuickUnionWeighted, QuickUnionPathCompression]

print "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"

def test(classes, tree):
    for e in classes:
        tmp = e(arr=tree)
        print tmp.id
        print "%s:" % tmp.__class__.__name__
        t = time.clock()
        print "\tare 3 and 6 connected?: %s" % tmp.connected(3, 6)
        "\tunion(3, 6): "
        tmp.union(3,6)
        print "\tare 3 and 6 connected?: %s" % tmp.connected(3, 6)
        print "Total time: {0} ".format(time.clock()-t)

if __name__ == '__main__':
    test(my_trees, ids)

This prints the following results:

[1, 8, 1, 7, 4, 8, 5, 7, 8, 2]
QuickFindUF:
    are 3 and 6 connected?: False
    are 3 and 6 connected?: True
Total time: 2.7e-05 
[1, 8, 1, 5, 4, 8, 5, 5, 8, 2]
QuickUnionUF:
    are 3 and 6 connected?: True
    are 3 and 6 connected?: True
Total time: 2.6e-05 
[1, 8, 1, 5, 4, 8, 5, 5, 8, 2]
QuickUnionWeighted:
    are 3 and 6 connected?: True
    are 3 and 6 connected?: True
Total time: 2.8e-05 
[1, 8, 1, 5, 4, 8, 5, 5, 8, 2]
QuickUnionPathCompression:
    are 3 and 6 connected?: True
    are 3 and 6 connected?: True
Total time: 2.7e-05

For some reason, the arrays are getting changed before the comparison in all but the QuickFindUF instance. Any ideas why?

This is the repo I've created: https://github.com/herrmendez/python-algorithms

Wooble
  • 87,717
  • 12
  • 108
  • 131
la_f0ka
  • 1,773
  • 3
  • 23
  • 44
  • (1) `timeit` beats `time` for small benchmarks. (2) An algorithm being asymptotically faster doesn't necessarily mean that your implementation of it is actually faster for tiny problem sizes (and let's face it, 10 *is* tiny; even linear search could be faster for 10 items). –  Feb 11 '13 at 16:01
  • This is a great example of why you should never use `from foo import *`; this is incredibly annoying to try to track down where your functions are coming from. – Wooble Feb 11 '13 at 16:01

1 Answers1

0

Your algorithm is modifying the list instance passed in. Python does not have the concept of taking arguments by copyable value; every bound name is an object reference passed by value, but some object types are immutable.

Pass the algorithm a copy of the list:

from copy import deepcopy

...

    tmp = e(arr=deepcopy(tree))
ecatmur
  • 152,476
  • 27
  • 293
  • 366
  • 1
    Nothing is passed by reference in Python. There are those things called "references" that are passed around, but that's a different thing altogether. –  Feb 11 '13 at 16:04
  • you are assuming that tree can be copied by [:]? – cmd Feb 11 '13 at 16:10
  • @cmd ah, thought it was a list. Will fix. – ecatmur Feb 11 '13 at 16:11
  • @delnan sorry, it's easy to get wrong. I think I've found the best duplicate to close this question. – ecatmur Feb 11 '13 at 16:12
  • @delnam there are 3 ways to pass something: value, pointer, reference. I'd like to hear your arguments of how python is not passing by reference. – cmd Feb 11 '13 at 16:12
  • @cmd in most pass-by-reference languages, assigning `arg = newval` within a called function modifies `arg` in the scope of the caller. In Python it just rebinds in the callee's scope. I've heard Python's style called "pass-by-object". – ecatmur Feb 11 '13 at 16:15
  • @cmd That's hardly the only ways of passing arguments, you may want to consult Wikipedia (for example, there is call-by-name, call-by-sharing, and call-by-need). In any case, unless you use a nonstandard definition of "pass by reference", passing by reference implies that `def zero(x): x = 0` changes something `my_variable` when called as `zero(my_variable)`. In Python it doesn't, so it's not pass by reference. –  Feb 11 '13 at 16:15
  • See http://stackoverflow.com/questions/10262920/understanding-pythons-call-by-object-style-of-passing-function-arguments – ecatmur Feb 11 '13 at 16:16
  • @delnan I would argue that it passes by reference, but the assignment doesn't work how other languages do. Assignments point the name to a new reference. – cmd Feb 11 '13 at 17:02
  • @cmd Then you are ignoring standard terminology. I'm sick of this topic, I and others have discussed it a thousand times already, so I'll leave it at that. –  Feb 11 '13 at 17:15