I have a huge file that looks like this:
CAV-1 ATCTACTTCTATCG
CAV-2 GCGCGTAGCTAGCT
CAV-2 AAGCGCTCGTAAAA
CAV-3 AAATATATATATCC
Using Python, I want to delete every line whose leading string (in this case "CAV-2") has already appeared on an earlier line, so that only the first line with that string remains. I would get this:
CAV-1 ATCTACTTCTATCG
CAV-2 GCGCGTAGCTAGCT
CAV-3 AAATATATATATCC
I know how to use regex and to parse through lines, but I am not able to do this specific task.
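To make the desired behaviour concrete, here is a minimal sketch of one possible approach. It assumes the string to compare is the first whitespace-separated field of each line, and that remembering every distinct field in a set fits in memory. The function name `dedupe_by_first_field` is a hypothetical placeholder, not something from an existing library.

```python
def dedupe_by_first_field(lines):
    """Yield each line whose first whitespace-separated field has not been seen yet."""
    seen = set()  # first fields encountered so far
    for line in lines:
        fields = line.split(None, 1)  # split once on any run of whitespace
        if not fields:
            continue  # assumption: blank lines are skipped
        key = fields[0]
        if key not in seen:
            seen.add(key)
            yield line  # first occurrence of this key: keep the line unchanged


# Sample data taken from the example above
sample = [
    "CAV-1 ATCTACTTCTATCG",
    "CAV-2 GCGCGTAGCTAGCT",
    "CAV-2 AAGCGCTCGTAAAA",
    "CAV-3 AAATATATATATCC",
]

for kept in dedupe_by_first_field(sample):
    print(kept)
```

Because the function is a generator that accepts any iterable of lines, an open file object could be passed in directly, so a huge file would be read one line at a time and only the set of first fields would be held in memory.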