I would like to delete the text lines that do not meet any of several conditions. This is an example:
">"nxp:NX_A0A075B6H9-1 \DbUniqueId=NX_A0A075B6H9-1 \PName=Immunoglobulin lambda variable 4-69 isoform Iso 1 \GName=IGLV4-69 \NcbiTaxId=9606 \TaxName=Homo Sapiens \Length=119 \SV=1 \EV=19 \PE=1 \ModRes=(42||Disulfide) MAWTPLLFLTLLLHCTGSLSQLVLTQSPSASASLGASVKLTCTLSSGHSSYAIAWHQQQP EKGPRYLMKLNSDGSHSKGDGIPDRFSGSSSGAERYLTISSLQSEDEADYYCQTWGTGI
">"nxp:NX_A0A075B6I0-1 \DbUniqueId=NX_A0A075B6I0-1 \PName=Immunoglobulin lambda variable 8-61 isoform Iso 1 \GName=IGLV8-61 \NcbiTaxId=9606 \TaxName=Homo Sapiens \Length=122 \SV=7 \EV=27 \PE=2 \ModRes=(46||Disulfide) MSVPTMAWMMLLLGLLAYGSGVDSQTVVTQEPSFSVSPGGTVTLTCGLSSGSVSTSYYPS WYQQTPGQAPRTLIYSTNTRSSGVPDRFSGSILGNKAALTITGAQADDESDYYCVLYMGS GI
">"nxp:NX_A0A075B6I1-1 \DbUniqueId=NX_A0A075B6I1-1 \PName=Immunoglobulin lambda variable 4-60 isoform Iso 1 \GName=IGLV4-60 \NcbiTaxId=9606 \TaxName=Homo Sapiens \Length=120 \SV=1 \EV=20 \PE=1 \ModRes=(43||Disulfide) MAWTPLLLLFPLLLHCTGSLSQPVLTQSSSASASLGSSVKLTCTLSSGHSSYIIAWHQQQ PGKAPRYLMKLEGSGSYNKGSGVPDRFSGSSSGADRYLTISNLQFEDEADYYCETWDSNT
I only want the entries that meet the condition PE=2, PE=4 or PE=5.
I tried to do this with the following code:
list = []
for line in open("nextprot_all.fasta", "r"):
    if line.startswith(">") and "PE=2" or "PE=4" or "PE=5" in line:
        list.append(line)
with open('test_1.txt', 'w') as output:
    for i in list:
        output.write(i)
The problem is that with this code I only get the first line (the header) in the new file, not the rest of the text that belongs to it.
Is there any way to capture the text between two ">" markers when the condition is true?
The result that I'd like to have is this:
">"nxp:NX_A0A075B6I0-1 \DbUniqueId=NX_A0A075B6I0-1 \PName=Immunoglobulin lambda variable 8-61 isoform Iso 1 \GName=IGLV8-61 \NcbiTaxId=9606 \TaxName=Homo Sapiens \Length=122 \SV=7 \EV=27 \PE=2 \ModRes=(46||Disulfide) MSVPTMAWMMLLLGLLAYGSGVDSQTVVTQEPSFSVSPGGTVTLTCGLSSGSVSTSYYPS WYQQTPGQAPRTLIYSTNTRSSGVPDRFSGSILGNKAALTITGAQADDESDYYCVLYMGS GI
Thank you in advance
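A note on the condition in the first attempt: in Python, `and` binds more tightly than `or`, and `in` applies only to the operand right next to it, so the expression is read as `(line.startswith(">") and "PE=2") or "PE=4" or ("PE=5" in line)`. Because the non-empty string `"PE=4"` is always truthy, the test succeeds for every line. The sketch below illustrates this; the function names and the shortened sample headers are made up for the illustration and are not part of the original program.

```python
def buggy_condition(line):
    # Parsed as: (line.startswith(">") and "PE=2") or "PE=4" or ("PE=5" in line)
    # The bare string "PE=4" is truthy, so the result is always True.
    return bool(line.startswith(">") and "PE=2" or "PE=4" or "PE=5" in line)


# Tags to look for; the backslash is doubled so Python keeps it literally.
WANTED_TAGS = ("\\PE=2", "\\PE=4", "\\PE=5")


def wanted_header(line):
    # True only for header lines that contain one of the wanted tags.
    return line.startswith(">") and any(tag in line for tag in WANTED_TAGS)


# Hypothetical, shortened lines for illustration only.
header_pe1 = ">nxp:NX_A0A075B6H9-1 \\EV=19 \\PE=1 \\ModRes=(42||Disulfide)"
header_pe2 = ">nxp:NX_A0A075B6I0-1 \\EV=27 \\PE=2 \\ModRes=(46||Disulfide)"
sequence_line = "MAWTPLLFLTLLLHCTGSLSQLVLTQSPSASASLGASVKLTCTLSSGHSSYAIAWHQQQP"

for sample in (header_pe1, header_pe2, sequence_line):
    print(buggy_condition(sample), wanted_header(sample))
```

Each comparison has to be written out in full (or wrapped in `any(...)` as here) for the test to depend on the content of the line.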
# Program fixed. My question was not about the conditional but about how I could also iterate over the lines that follow a matching header.
kept_lines = []
keep = False  # True while we are inside a record whose header matched
with open("nextprot_all.peff", 'r') as infile:
    for line in infile:
        if line.startswith(">"):
            # A new header decides whether the whole record is kept (PE=2, PE=4 or PE=5)
            keep = "\\PE=2" in line or "\\PE=4" in line or "\\PE=5" in line
        if keep:
            kept_lines.append(line)
with open('test_2.txt', 'w') as output:
    for i in kept_lines:
        output.write(i)
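An alternative to the flag-based loop above is to group the lines into records first and then filter whole records. This is only a sketch under a few assumptions: every record starts with a line beginning with ">", the wanted PE values are the ones listed in the question (2, 4 and 5), and the names `iter_records`, `filter_records` and `PE_PATTERN` are hypothetical helpers introduced here. The sample data is a shortened, made-up version of the entries shown in the question.

```python
import io
import re

# Matches \PE=2, \PE=4 or \PE=5 in a header line (not followed by another digit).
PE_PATTERN = re.compile(r"\\PE=(?:2|4|5)(?!\d)")


def iter_records(lines):
    """Yield each record as a list of lines: a '>' header plus the lines after it."""
    record = []
    for line in lines:
        if line.startswith(">") and record:
            yield record
            record = []
        record.append(line)
    if record:
        yield record


def filter_records(lines, pattern=PE_PATTERN):
    """Yield all lines of every record whose header matches the pattern."""
    for record in iter_records(lines):
        header = record[0]
        if header.startswith(">") and pattern.search(header):
            yield from record


# Shortened, made-up sample standing in for the real file.
sample = io.StringIO(
    ">nxp:NX_A0A075B6H9-1 \\EV=19 \\PE=1\n"
    "MAWTPLLFLT\n"
    ">nxp:NX_A0A075B6I0-1 \\EV=27 \\PE=2\n"
    "MSVPTMAWMM\n"
    "GI\n"
)
kept = list(filter_records(sample))
print("".join(kept), end="")
```

With a real file, the same generator can be fed an open file object, for example `output.writelines(filter_records(infile))` inside a `with open(...)` block, so the whole file never has to be held in a list.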