I have a large DAT file of this type:
//
AC T00020
OS rat, Rattus norvegicus
BS R02959; HS$APOA1_02; Quality: 6; APOA1, G000203; human, Homo sapiens.
//
AC T00024
OS rat, Rattus norvegicus
BS R00135; HS$APOA1_01; Quality: 6; APOA1, G000203; human, Homo sapiens.
//
AC T00025
OS human, Homo sapiens
BS R02119; ANF$CONS_01; Quality: 4.
BS R02333; MOUSE$ALBU_12; Quality: 6; Alb, G000464; mouse, Mus musculus.
BS R02334; MOUSE$ALBU_13; Quality: 6; Alb, G000464; mouse, Mus musculus.
//
AC T00027
OS clawed frog, Xenopus
BS R02120; AP1$CONS; Quality: 6.
//
I first want to split it into modules, where each module starts and ends with '//'. Then I want to keep only those modules that contain 'OS human, Homo sapiens'.
I am writing a Python 3 script to achieve this, but I have not yet managed to split the file into modules.
In the end, I want to keep this part of the DAT file:
AC T00025
OS human, Homo sapiens
BS R02119; ANF$CONS_01; Quality: 4.
BS R02333; MOUSE$ALBU_12; Quality: 6; Alb, G000464; mouse, Mus musculus.
BS R02334; MOUSE$ALBU_13; Quality: 6; Alb, G000464; mouse, Mus musculus.
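
One possible approach, as a minimal sketch: read the file line by line, treat each '//' line as a separator between modules, and keep a module only if one of its OS lines mentions Homo sapiens. The function names `iter_records` and `filter_by_species`, the `species` parameter, and the file name `factors.dat` in the comment are all hypothetical, and the sketch assumes every module is closed by a '//' line as in the sample above.

```python
def iter_records(lines):
    """Yield each module as a list of lines, using '//' lines as separators."""
    record = []
    for line in lines:
        line = line.rstrip("\n")
        if line.strip() == "//":
            if record:          # skip the empty chunk before the first '//'
                yield record
            record = []
        elif line.strip():      # ignore blank lines
            record.append(line)
    if record:                  # a trailing module with no closing '//'
        yield record


def filter_by_species(lines, species="Homo sapiens"):
    """Keep only modules whose OS line mentions `species` (hypothetical helper)."""
    return [
        rec for rec in iter_records(lines)
        if any(l.startswith("OS") and species in l for l in rec)
    ]


# Shortened copy of the sample from the question.
sample = """//
AC T00020
OS rat, Rattus norvegicus
BS R02959; HS$APOA1_02; Quality: 6; APOA1, G000203; human, Homo sapiens.
//
AC T00025
OS human, Homo sapiens
BS R02119; ANF$CONS_01; Quality: 4.
BS R02333; MOUSE$ALBU_12; Quality: 6; Alb, G000464; mouse, Mus musculus.
BS R02334; MOUSE$ALBU_13; Quality: 6; Alb, G000464; mouse, Mus musculus.
//
AC T00027
OS clawed frog, Xenopus
BS R02120; AP1$CONS; Quality: 6.
//
"""

kept = filter_by_species(sample.splitlines())
for rec in kept:
    print("\n".join(rec))

# For a real file (name assumed), the file object can be passed directly:
# with open("factors.dat") as fh:
#     kept = filter_by_species(fh)
```

The check is restricted to lines starting with OS because BS lines can also contain 'human, Homo sapiens' (as in module T00020), so searching the whole module for the species name would keep rat modules too.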