I am writing a program that analyzes and parses consensus sequences. I have successfully gotten the analyzing and parsing to work and the program tells me whether two sequences are concordant or not.

I want to add one more feature: if the two sequences are not concordant, I want the program to tell me the position where they do not match.

For example:

If sequence 1 is GACTTTTTACTTTTTTG and sequence 2 is GACCTTTTACTTTTTTG,

it will tell me that sequence 1 is not concordant to sequence 2, but I also want it to tell me that the non-concordance is at the 4th letter.

How can I get the program to do this?

Here is the code I have so far:

for (h1,s1),(h2,s2) in combinations(zip(header,sequence),2):
    if s1[start:stop]==s2[start:stop]:
        print h1, "is concordant to", h2
    else:
        print h1, "is not concordant to", h2
        nonconcordance_position=[]
        nonconcordance_position.append(idx2[n-1])
        print "position of non concordance:", nonconcordance_position

This runs, but it does not give the correct position.

  • If concordance means whether the strings are exactly the same, then this is the same as finding the common prefix of two strings. If so, http://stackoverflow.com/a/6718435/85337 is a good way to do that. Specifically using `os.path.commonprefix`. – Justin Anderson Aug 11 '16 at 15:44
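A minimal sketch of the `os.path.commonprefix` idea from the comment above, assuming "concordant" means the two strings are exactly equal. The function name `first_mismatch_position` is made up for illustration, and the code uses the Python 3 `print` function rather than the Python 2 `print` statement from the question. The first mismatch sits one position past the end of the common prefix:

```python
import os.path


def first_mismatch_position(seq1, seq2):
    """Return the 1-based position of the first difference, or None if the strings are identical.

    Hypothetical helper, not part of the question's program.
    """
    # commonprefix compares character by character, so it works on any
    # strings, not only on file paths.
    prefix = os.path.commonprefix([seq1, seq2])
    if len(prefix) == len(seq1) == len(seq2):
        return None  # the sequences are identical
    # The first differing letter comes right after the shared prefix.
    return len(prefix) + 1


print(first_mismatch_position("GACTTTTTACTTTTTTG", "GACCTTTTACTTTTTTG"))  # prints 4
```

If one sequence is simply a shorter prefix of the other, this reports the position just past the end of the shorter one.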

1 Answer

In your else branch, you could loop through the two sequence strings and print the first index at which their characters differ.

Something like:

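else:
    # Compare the sequences (s1, s2), not the headers (h1, h2).
    # Stop at the shorter length to avoid an IndexError.
    for i in range(min(len(s1), len(s2))):
        if s1[i] != s2[i]:
            # i is 0-based; use i + 1 to report the "4th letter" as 4.
            print i
            break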
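The loop above stops at the first mismatch. If the sequences can differ in more than one place, a variation along these lines may be more useful. It is only a sketch: `mismatch_positions` is a hypothetical helper name, positions are reported 1-based to match the "4th letter" wording in the question, and the code is written for Python 3 (the question uses Python 2 `print` statements).

```python
def mismatch_positions(seq1, seq2):
    """Return the 1-based positions at which seq1 and seq2 differ.

    Hypothetical helper, not part of the question's program.
    """
    # zip stops at the shorter sequence, so any extra trailing letters
    # in the longer one are not compared.
    return [
        position
        for position, (a, b) in enumerate(zip(seq1, seq2), start=1)
        if a != b
    ]


positions = mismatch_positions("GACTTTTTACTTTTTTG", "GACCTTTTACTTTTTTG")
print("position of non-concordance:", positions)  # prints: position of non-concordance: [4]
```

In the question's loop this could be called as `mismatch_positions(s1[start:stop], s2[start:stop])`. The positions are then relative to the start of the slice, not to the full sequence.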