4

Trying to implement Boyer Moore, bad character matching algorithm without looking at the answers. I want to exit this loop I wrote with just one line of "No match" if there was no pattern found, here is my code:

T = 'ATGGGTATGGCTCGGCTCGG'
p = 'ATCG'

skip = 0

while skip + len(p) < len(T):

    string = T[skip:skip+len(p)]

    if string == p:
        print "offset at: ", skip
        break


    for i in range(len(string), 0, -1):


        if (string[i-1] != p[i-1]):

                if string[i-1] in p[:i-1]:
                    skip += (i - p.find(string[i-1]) - 1)
                    break

                elif not string[i-1] in p[:i-1]:
                    skip += i
                    break

Any hits regarding how to modify the code.

Thank you,

xp

edit: Scheme gave me the answer, as easy as that. I was so looping my head in the line string == p or string != p. Thank you for all your comments.

Xp.L
  • 837
  • 1
  • 7
  • 13
  • 1
    It sounds like a [while-else](https://stackoverflow.com/questions/3295938/else-clause-on-python-while-statement) might be useful here, but your description of what you want to do is ambiguous and missing too many details to tell what you're really trying to do. – user2357112 Jun 21 '17 at 22:57
  • 1
    You might want to see if [biopython](https://github.com/biopython/biopython.github.io/) or something similar has the functionality you want, or if Python's built-in string search algorithm is good enough for your use case. It looks like you were trying to implement an advanced string search algorithm (Boyer-Moore?), but you didn't do the preprocessing of Boyer-Moore, and things like copying on every string slice and Python interpreter overhead are going to slow things down too . – user2357112 Jun 21 '17 at 23:09
  • 1
    what about using a boolean variable before the loop `match = False`, and after/outside the loop `if not match: print "No match"` – chickity china chinese chicken Jun 21 '17 at 23:43

2 Answers2

1

I didn't quite get what you're trying to achieve here, but assuming you want to search for the substring within the base string, you can just use in as follows:

if p in T:
    print "Match"
else:
    print "No match"
Pratik K.
  • 147
  • 2
  • 13
  • I believe he is going for the standard find all the occurrences of x in y programming exercise rather than just a direct match. – axwr Jun 21 '17 at 23:01
1

If i am understanding you correctly you would like to exit your while loop and print "no match" only once, if the string p is not contained in t. This looks like the all too common DNA exercise.

Anyway a simple way to do what you want is to check for the substring p in t and break if it is not found. The answer here: How to determine whether a substring is in a different string would help with that.

Basically:

T = 'ATGGGTATGGCTCGGCTCGG'
p = 'ATCG'

skip = 0

while skip + len(p) < len(T):

    if not p in T:
        print "no match"
        break

    string = T[skip:skip+len(p)]

    if string == p:
        print "offset at: ", skip
        break


    for i in range(len(string), 0, -1):


        if (string[i-1] != p[i-1]):

                if string[i-1] in p[:i-1]:
                    skip += (i - p.find(string[i-1]) - 1)
                    break

                elif not string[i-1] in p[:i-1]:
                    skip += i
                    break

The key here is the line if not p in T: Which is basically saying If the string p does not exist inside the string T then do something. else, carry on.

On top of this you may want to consider looking at some good practices like variable naming, and using functions.

axwr
  • 2,118
  • 1
  • 16
  • 29
  • Thanks for the reply. I realized I was ambiguous about what I wanted. I was attempting to implement Boyer Moore for string matching. I know there are tons of answers out there, but I needed to try myself. So if there is a match in the sequence, this code would tell you where the match is at, but yes, as easy as that, your answer helps. Thank you. – Xp.L Jun 22 '17 at 04:23