0

I have been tasked with creating a binary search over a list of words, I have come up with 2 implementations (and clearly haven't put in a case where the word isn't found yet but that's not an issue yet), however when the list is narrowed down to the word I am looking for, my function does not finish, instead it keeps running until maximum recursive depth is exceeded.

I put in print and it clearly shows the word at dasList[mid], and shows this over and over again until it finally gives up.

def _bisect2(dasList, word):
    mid = int(len(dasList)/2)
    if word.lower() > dasList[mid].lower():
        return _bisect2(dasList[mid: len(dasList)], word)            
    if word.lower() < dasList[mid].lower():
        return _bisect2(dasList[0: mid], word)
    else:
        return mid

this is being called by

print(_bisect2(fileList, input('Please type a word')))

I am using the Python 3.0 interpretor. Any suggestions?

Levon
  • 138,105
  • 33
  • 200
  • 191
Tony
  • 23
  • 5

3 Answers3

1

Your implementation (almost) works for me (and doesn't show the behavior you described with my pre-sorted input). I assume you've sorted your input files? A slightly modified (working) example is posted below.

def _bisect2(dasList, word,lidx=0):
    mid = int(len(dasList)/2)
    if word.lower() > dasList[mid].lower():
        return _bisect2(dasList[mid:], word,lidx=lidx+mid)            
    elif word.lower() < dasList[mid].lower():
        return _bisect2(dasList[:mid], word,lidx=lidx)
    return lidx+mid

words=sorted(["one","two","three","four","five","twenty","foo"])
print (words)
print (_bisect2(words,'three'))

Note that you were returning the index in the last partial list (which will always be 0)...

mgilson
  • 300,191
  • 65
  • 633
  • 696
  • 1
    yes the list of words is already sorted, and yes you're right it would return a 0. What i did notice however is the word i input seems to append a space onto the end. why would that be? – Tony May 31 '12 at 01:40
  • 1
    Well, that'll do it ;) Then your word isn't in the list. Just `strip` your input and you should be OK. – mgilson May 31 '12 at 01:41
  • 1
    cool thanks. ^^ why would it append it? is it just the newline character being added after enter is pressed? – Tony May 31 '12 at 01:44
  • @Tony -- I'm not sure. try using `raw_input` instead of `input` -- since you want strings, that's probably safer anyway. – mgilson May 31 '12 at 01:49
  • @Tony The blank space at the end of your input you prompt for is odd and I can't reproduce that behavior with v 3.2.3 – Levon May 31 '12 at 01:56
  • 3
    by all accounts 'raw_input()' has been changed to input http://stackoverflow.com/questions/954834/how-do-i-use-raw-input-in-python-3-1 and after investigating '.strip()' I ran into '.rstrip()' which did the job as well. Thanks for your help! – Tony May 31 '12 at 02:00
  • @Tony -- Right, forgot you were using python 3. Sorry about that ;). – mgilson May 31 '12 at 02:01
  • @Levon yes it is confusing me a bit as well, and I can't see where or why it is happening. I might consider getting v3.2.3 after this debaccle – Tony May 31 '12 at 02:05
  • @mgilson all good, you atleast pointed me in the right direction – Tony May 31 '12 at 02:06
1

This works fine for me. Note that the index returned at the end will always be the index of the word in the minimal list, not the index of the original list.

See also that the > compare doesn't do a len over the list again, it just iterates to the end. Slice syntax allows you to leave off the last number if you're iterating to the end.

words = "The quick brown fox jumped over the lazy dog".split()

def bisect(words, word):
    mid = int(len(words)/2)
    if word.lower() > words[mid].lower():
        return bisect(words[mid:], word)
    elif word.lower() < words[mid].lower():
        return bisect(words[0:mid], word)
    return mid

words = sorted(words)
print bisect(words, 'dog')
Josh Smeaton
  • 47,939
  • 24
  • 129
  • 164
1

Why not use Python's bisect module?

Recipe from the docs:

def index(a, x):
    'Locate the leftmost value exactly equal to x'
    i = bisect_left(a, x)
    if i != len(a) and a[i] == x:
        return i
    raise ValueError

Example:

>>> a = ['alfred','edward','mary','susan','thomas','wilma']
>>> index(a, 'mary')
2
>>> index(a, 'martha')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in index
ValueError
Steven Rumbalski
  • 44,786
  • 9
  • 89
  • 119