I'm trying to download the sequences from the FTP paths by adding a suffix to each one. The for loop works well, i.e. I manage to download the sequences until a URL that does not exist comes up, and then the execution stops. I would like the script to keep running to the end even when a URL doesn't exist. I would also like to put all the output files in a folder named after the species entered in input(). Thank you. Here is part of the script:
# imports used by the snippet below
import posixpath
import urllib.parse
import wget

species = input("Bacteria species ? : ")
TypeSeq = input("fna ? or faa ? : ")
if data["#Organism/Name"].str.contains(species, case=False).any():
    print(data.loc[data["#Organism/Name"].str.contains(species, case=False)]["Status"].value_counts())
    FTP_list = data.loc[data["#Organism/Name"].str.contains(species, case=False)]["FTP Path"].values
    if TypeSeq == "faa":
        try:
            for url in FTP_list:
                parts = urllib.parse.urlparse(url)
                suffix = "_protein.faa.gz"
                prefix = posixpath.basename(parts.path)
                print(prefix + suffix)
                # build the download URL: FTP path + "/" + assembly name + suffix
                path = posixpath.join(parts.path, prefix + suffix)
                ret = parts._replace(path=path)
                sequence = wget.download(urllib.parse.urlunparse(ret))
        except:
            # any failed download raises here, so the loop stops at the first missing file
            print("")
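Here is a rough, untested sketch of what I think the loop could look like, with the try/except moved inside the loop so a missing file only skips that URL, and with the files saved in a folder named after the species. I am assuming that wget.download's out argument accepts a directory, and the "_genomic.fna.gz" suffix for the fna case is my guess:

import os
import posixpath
import urllib.parse
import wget

out_dir = species                       # folder named after the species typed in input()
os.makedirs(out_dir, exist_ok=True)     # create it only if it does not already exist

suffix = "_protein.faa.gz" if TypeSeq == "faa" else "_genomic.fna.gz"

for url in FTP_list:
    parts = urllib.parse.urlparse(url)
    prefix = posixpath.basename(parts.path)
    full_url = urllib.parse.urlunparse(
        parts._replace(path=posixpath.join(parts.path, prefix + suffix))
    )
    try:
        # out= tells wget.download where to save the file
        sequence = wget.download(full_url, out=out_dir)
        print(" saved", sequence)
    except Exception as exc:
        # a missing URL only skips this assembly instead of stopping the whole loop
        print("could not download", full_url, ":", exc)
        continue

If wget's out argument does not behave as expected, I suppose urllib.request.urlretrieve(full_url, os.path.join(out_dir, prefix + suffix)) could be used instead.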