I know how to find a string of characters or a word in a txt file, but I don't know how to find the exact position of those characters. For example:

GCATTCTGAGGCATTCTCTAACAGGTTCTCGACCCTCCGCCATGGCCCCGTGGATGCATCTCCTCACCGT
GCTGGCCCTGCTGGCCCTCTGGGGACCCAACTCTGTTCAGGCCTATTCCAGCCAGCACCTGTGCGGCTCC
AACCTAGTGGAGGCACTGTACATGACATGTGGACGGAGTGGCTTCTATAGACCCCACGACCGCCGAGAGC
TGGAGGACCTCCAGGTGGAGCAGGCAGAACTGGGTCTGGAGGCAGGCGGCCTGCAGCCTTCGGCCCTGGA
GATGATTCTGCAGAAGCGCGGCATTGTGGATCAGTGCTGTAATAACATTTGCACATTTAACCAGCTGCAG
AACTACTGCAATGTCCCTTAGACACCTGCCTTGGGCCTGGCCTGCTGCTCTGCCCTGGCAACCAATAAAC
CCCTTGAATGAG

This is the sequence and I have to find the position of these characters in the sequence:

TCGACCCTCCGCCAT

I've done this, but I don't know how to find the start and end positions of the characters.

with open('sequence.txt') as file:
    contents = file.read()
search_word = input("enter the sequence u want to search in the file : ")
if search_word in contents:
    print('SEQUENCE FOUND!')
else:
    print('SEQUENCE NOT FOUND')
  • what are you expecting to be the format of returned position? – ParthS007 Feb 24 '21 at 15:21
  • start position = "number here" end position = "number here" –  Feb 24 '21 at 15:22
  • `contents.find(search_word)` is the start, and then the end is just the start plus the length of the substring. – Kraigolas Feb 24 '21 at 15:24
  • Unless you are *specifically* asking about how to solve a cross-version compatibility problem (in which case your question should obviously describe that problem) you should not mix the [tag:python-2.7] and [tag:python-3.x] tags. – tripleee Feb 24 '21 at 15:28

2 Answers

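with open('sequence.txt') as file:
    contents = file.read()
search_word = input("enter the sequence u want to search in the file : ")
# find returns the index of the first match, or -1 if there is none
start = contents.find(search_word)
end = start + len(search_word)
if start != -1:
    print('start position =', start, 'end position =', end)
else:
    print('SEQUENCE NOT FOUND')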
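
As a self-contained illustration of the same idea, here is a minimal sketch that uses the first line of the sequence from the question instead of reading a file. The function name `find_first`, the whitespace stripping, and the 1-based inclusive positions are assumptions made for this example, not something the question specifies:

```python
def find_first(sequence, pattern):
    """Return the 1-based (start, end) of the first match, or None."""
    # Remove whitespace/newlines so a match may span lines in the file.
    cleaned = "".join(sequence.split())
    index = cleaned.find(pattern)  # 0-based index, or -1 if absent
    if index == -1:
        return None
    # Convert to 1-based, inclusive coordinates (an assumption about
    # the desired output format; use index and index + len(pattern)
    # for Python-style 0-based, end-exclusive positions).
    return index + 1, index + len(pattern)


# First line of the sequence from the question, used as sample data.
sample = "GCATTCTGAGGCATTCTCTAACAGGTTCTCGACCCTCCGCCATGGCCCCGTGGATGCATCTCCTCACCGT"
result = find_first(sample, "TCGACCCTCCGCCAT")
if result is None:
    print("SEQUENCE NOT FOUND")
else:
    print(f"start position = {result[0]} end position = {result[1]}")
```

Note that stripping line breaks changes the numbering compared with searching the raw file contents, because newline characters no longer count as positions.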

However, this only returns the first occurrence. To find all non-overlapping occurrences you could do this:

positions = []
with open('sequence.txt') as file:
    contents = file.read()
search_word = input("enter the sequence u want to search in the file : ")
# find returns -1 if it cannot find the word,
# so we run the loop as long as start is not -1
start = contents.find(search_word)
while start != -1:
    end = start + len(search_word)
    positions.append((start, end))
    # continue searching from the end of this match, so that matches do
    # not overlap and every position refers to the original text
    start = contents.find(search_word, end)
print(positions)
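
The difference between overlapping and non-overlapping matches comes down to where the next search begins. The following sketch wraps the loop in a hypothetical helper, `find_all`, whose name and `overlapping` flag are assumptions for illustration. It relies only on the optional start argument of `str.find`:

```python
def find_all(text, pattern, overlapping=False):
    """Return a list of 0-based (start, end) pairs; end is exclusive."""
    if not pattern:
        return []  # an empty pattern would otherwise loop forever
    positions = []
    start = text.find(pattern)
    while start != -1:
        end = start + len(pattern)
        positions.append((start, end))
        # Step one character for overlapping matches, else skip the match.
        next_from = start + 1 if overlapping else end
        start = text.find(pattern, next_from)
    return positions


print(find_all("AAAA", "AA"))                    # [(0, 2), (2, 4)]
print(find_all("AAAA", "AA", overlapping=True))  # [(0, 2), (1, 3), (2, 4)]
```

Because the text is never sliced, every reported position is relative to the original string.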
Manuel
  • says pos is not defined –  Feb 24 '21 at 15:32
  • Sorry, forgot to edit something, there was another typo which I just fixed – Manuel Feb 24 '21 at 15:33
  • still gives the same error :( also can u add a comment in the code and explain to me what u did in the 2nd part? it'd be very much appreciated –  Feb 24 '21 at 15:36
  • the error is gone but theres no output still not giving me the position –  Feb 24 '21 at 15:45

When you read a txt file, its text is a string, so you can convert the string to a list of characters and then scan through the list.
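
One possible reading of this suggestion, sketched under the assumption that "read the list" means comparing a sliding window of characters against the search word (the helper name `find_in_list` is hypothetical):

```python
def find_in_list(text, pattern):
    """Return the 0-based (start, end) of the first match, or None."""
    chars = list(text)      # e.g. "GAT" becomes ['G', 'A', 'T']
    target = list(pattern)
    width = len(target)
    # Slide a window of len(pattern) characters across the list.
    for start in range(len(chars) - width + 1):
        if chars[start:start + width] == target:
            return start, start + width  # end is exclusive
    return None


print(find_in_list("GATTACA", "TTA"))  # (2, 5)
```

This gives the same start index as `str.find` on the original string, so the conversion to a list is not strictly required.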

Lasher