Searching for matching sequences in two gb files

Question

I have two GenBank files. I am extracting the genes from the first one as follows:

# `sequence` is the record already parsed from the first GenBank file
genes_1 = []
for feature in sequence.features:
    if feature.type == 'gene':
        genes_1.append(feature)

That works fine: I can obtain the sequence, the GC content, and the translation of the gene I need.

The second GenBank file is from a very similar strain of bacteria. My idea is to use the newly extracted sequence:

dnaA = hits[0]
extracted_sequence_1 = dnaA.extract(sequence_tuberculosis)

to search the second GenBank file:

for extracted_sequence_1 in genes_2:
    for gene in genes_2.extract(sequence_2):
        if extracted_sequence_1 in genes_2.extract(sequences_2):
            print('Match')

However, unsurprisingly, I get an error:

AttributeError: 'list' object has no attribute 'extract'
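
The error arises because genes_2 is a plain Python list, whereas extract is a method of each individual feature stored in it. Below is a minimal sketch of the per-feature comparison, not a definitive implementation. The function name find_matching_genes is hypothetical, and it assumes the parent sequence is passed as a Seq (for example the .seq attribute of the second record), so that str() of each extracted piece yields the bare nucleotide string.

```python
def find_matching_genes(query_seq, features, parent_seq):
    """Return the features whose extracted sequence equals query_seq exactly.

    Assumptions (hypothetical helper, not part of Biopython):
    - each item in `features` has an extract(parent_seq) method,
      as Biopython SeqFeature objects do;
    - str() of the extracted piece gives the bare sequence, which holds
      when parent_seq is a Seq rather than a whole SeqRecord.
    """
    query = str(query_seq).upper()
    matches = []
    for feature in features:
        # extract() is called on one feature at a time, never on the list
        candidate = str(feature.extract(parent_seq)).upper()
        if candidate == query:
            matches.append(feature)
    return matches


# Possible usage with the names from the question, assuming sequence_2
# is the record parsed from the second file:
# matches = find_matching_genes(extracted_sequence_1, genes_2, sequence_2.seq)
# if matches:
#     print('Match')
```

This only reports exact matches. In a closely related strain, a single point mutation in the gene would make the comparison fail.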

I have been looking for this in the Biopython documentation, but I can't find anything similar to what I need. Is there a way to do this without running alignments?
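
One alignment-free option is an exact substring search of the gene against the whole sequence of the second genome, on both strands. The sketch below uses only plain Python strings. The names reverse_complement and find_exact_occurrence are hypothetical helpers, and the code assumes both inputs convert to their nucleotide string with str(), as a Seq does.

```python
# Translation table for the four unambiguous bases; ambiguity codes
# such as N are left unchanged by translate().
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_exact_occurrence(query_seq, genome_seq):
    """Return (strand, start) of the first exact occurrence, or None.

    strand is 1 for the forward strand and -1 when only the reverse
    complement of the query occurs in the genome. start is the 0-based
    index on the forward strand of the genome.
    """
    query = str(query_seq).upper()
    genome = str(genome_seq).upper()

    pos = genome.find(query)
    if pos != -1:
        return (1, pos)

    pos = genome.find(reverse_complement(query))
    if pos != -1:
        return (-1, pos)

    return None


# Possible usage with the names from the question, assuming sequence_2
# is the record parsed from the second file:
# hit = find_exact_occurrence(extracted_sequence_1, sequence_2.seq)
# if hit is not None:
#     print('Match')
```

Like any exact comparison, this returns None as soon as the two strains differ by a single base within the gene. Tolerating such differences would need some form of approximate matching.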

The extract in sequence_2 simply takes the sequence of the genes to compare to the one that has been saved into extracted_sequence_1 — Harr1ls, Jun 20 '21 at 15:28
Is genes.append(feature) supposed to be genes_1.append(feature)? — pippo1980, Jun 20 '21 at 15:44
@pippo1980 yes you are right, I forgot to edit that, but that part is working anyway. I tried to summarise it in here — Harr1ls, Jun 20 '21 at 15:45
@Harr1ls we need the entire code with the relevant imports to figure anything out; example input and output would help too. See: https://stackoverflow.com/help/how-to-ask. You could also try cross-posting on Bioinformatics Stack Exchange with the biopython tag, or on biostar.org. Nevertheless, I found an interesting paper on dnaA: https://journals.plos.org/plospathogens/article/authors?id=10.1371/journal.ppat.1009063 — pippo1980, Jun 20 '21 at 15:56

0 Answers