Does anyone know how to automatically search for and parse GenBank (.gbk) files from the NCBI FTP site using either Biopython or BioJava? I have looked for suitable utilities in BioJava and have not found any. I have also tried Biopython, and here is my code:
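from Bio import Entrez

# NCBI asks for a contact email and tool name with E-utilities requests
Entrez.email = "test@yahoo.com"
Entrez.tool = "MyLocalScript"

# Search the nucleotide database for every record assigned to the organism
handle = Entrez.esearch(db="nucleotide", term="Mycobacterium avium[Orgn]")
record = Entrez.read(handle)
handle.close()

print(record)
print(record["Count"])   # total number of matching records
id_L = record["IdList"]  # only the first retmax IDs (20 by default)
print(id_L)
print(len(id_L))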
However, there are only 3 Mycobacterium avium genomes (whole genome sequences, fully annotated), yet the count I am getting is 59897.
Can anyone tell me how to perform this search in either BioJava or Biopython? Otherwise I will have to automate this process from scratch.
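For reference, here is a minimal sketch of the kind of narrowed search I have in mind. The term "Mycobacterium avium[Orgn]" on its own matches every nucleotide record assigned to the organism, so the sketch adds extra filters. The helper names build_term and fetch_genbank_records are my own (hypothetical), and the choice of "complete genome"[Title] and srcdb_refseq[PROP] as filters is an assumption that I have not confirmed returns exactly the 3 genomes. Entrez.esearch, Entrez.efetch and SeqIO.parse are the Biopython calls it relies on.

```python
def build_term(organism, title_phrase=None, refseq_only=False):
    """Build an Entrez query string for the nucleotide database.

    The filters are assumptions about what separates complete genomes
    from the many other records for the organism; adjust as needed.
    """
    parts = ["%s[Orgn]" % organism]
    if title_phrase:
        # Restrict to records whose title contains the given phrase
        parts.append('"%s"[Title]' % title_phrase)
    if refseq_only:
        # Restrict to RefSeq records
        parts.append("srcdb_refseq[PROP]")
    return " AND ".join(parts)


def fetch_genbank_records(term, email, retmax=50):
    """Search NCBI nucleotide and yield parsed GenBank records.

    Needs network access and Biopython.
    """
    # Imported here so build_term() works without Biopython installed
    from Bio import Entrez, SeqIO

    Entrez.email = email

    # retmax caps how many IDs come back (the default is only 20)
    search_handle = Entrez.esearch(db="nucleotide", term=term, retmax=retmax)
    search_result = Entrez.read(search_handle)
    search_handle.close()

    id_list = search_result["IdList"]
    if not id_list:
        return

    # Download the matching records in GenBank flat-file format
    fetch_handle = Entrez.efetch(
        db="nucleotide", id=",".join(id_list), rettype="gb", retmode="text"
    )
    try:
        for record in SeqIO.parse(fetch_handle, "genbank"):
            yield record
    finally:
        fetch_handle.close()
```

Calling fetch_genbank_records(build_term("Mycobacterium avium", title_phrase="complete genome"), "test@yahoo.com") should then yield one SeqRecord per hit, so record.id and record.description can be checked before doing anything heavier with the annotations.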
Thank you.