I wanted to remove part of the headings/annotations for a FASTA genome file so I could maintain only the locus tags and the protein description.
Eg. Convert:
lcl|CP000438.1_cds_ABJ14958.1_2 [gene=dnaN] [locus_tag=PA14_00020] [protein=DNA polymerase III, beta chain] [protein_id=ABJ14958.1] [location=2056..3159] [gbkey=CDS] ATGCATTTCACCATTCAACGCGAAGCCCTGTTGAAACCGCTGCAACTGGTCGCCGGCGTCGTGGAACGCC GCCAGACATTGCCGGTTCTCTCCAACGTCCTGCTGGTGGTCGAAGGCCAGCAACTGTCGCTGACCGGCAC
to :
[locus_tag=PA14_00020] [protein=DNA polymerase III, beta chain] ATGCATTTCACCATTCAACGCGAAGCCCTGTTGAAACCGCTGCAACTGGTCGCCGGCGTCGTGGAACGCC GCCAGACATTGCCGGTTCTCTCCAACGTCCTGCTGGTGGTCGAAGGCCAGCAACTGTCGCTGACCGGCAC
I would like to modify all the headers in my FASTA file in this manner. I only recently started learning python so I'm pretty lousy at writing the code for such tasks. I would greatly appreciate it if anyone could help.