
I have a large DAT file of this type:

//
AC  T00020
OS  rat, Rattus norvegicus
BS  R02959; HS$APOA1_02; Quality: 6; APOA1, G000203; human, Homo sapiens.
//
AC  T00024
OS  rat, Rattus norvegicus
BS  R00135; HS$APOA1_01; Quality: 6; APOA1, G000203; human, Homo sapiens.
//
AC  T00025
OS  human, Homo sapiens
BS  R02119; ANF$CONS_01; Quality: 4.
BS  R02333; MOUSE$ALBU_12; Quality: 6; Alb, G000464; mouse, Mus musculus.
BS  R02334; MOUSE$ALBU_13; Quality: 6; Alb, G000464; mouse, Mus musculus.
//
AC  T00027
OS  clawed frog, Xenopus
BS  R02120; AP1$CONS; Quality: 6.
//

I first want to break it into modules, where each module starts and ends with '//'. Then I want to keep only those modules that contain 'OS  human, Homo sapiens'.

I am writing a Python script to achieve this, but I am not able to break the file into modules yet. I am using Python 3.

Finally, I want to keep this part of the DAT file:

AC  T00025
OS  human, Homo sapiens
BS  R02119; ANF$CONS_01; Quality: 4.
BS  R02333; MOUSE$ALBU_12; Quality: 6; Alb, G000464; mouse, Mus musculus.
BS  R02334; MOUSE$ALBU_13; Quality: 6; Alb, G000464; mouse, Mus musculus.
Timothy
Amy

1 Answer


Open the file and read its entire contents into one string (not line by line), for example with `s = f.read()`.

Split that string on a chosen character or string, here the '//' separator.

# puts each text block as an item in a list
items = s.split('//')

Write the results.
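Putting the three steps together, a minimal sketch might look like the following. The names `is_human`, `human_blocks` and `filter_file`, and the output file name in the usage note, are made up for illustration and are not part of the original answer. The sketch assumes '//' never appears inside a record line, and it checks only the `OS` line, because the `BS` lines of the rat records also contain the text "human, Homo sapiens".

```python
def is_human(block):
    """True if the block has an OS line whose value is 'human, Homo sapiens'."""
    for line in block.splitlines():
        fields = line.split(None, 1)  # e.g. ['OS', 'human, Homo sapiens']
        if len(fields) == 2 and fields[0] == "OS" and fields[1].strip() == "human, Homo sapiens":
            return True
    return False


def human_blocks(text):
    """Split text on '//' and return the non-empty blocks that pass is_human."""
    kept = []
    for block in text.split("//"):
        block = block.strip()  # drop the newlines left around each block
        if block and is_human(block):
            kept.append(block)
    return kept


def filter_file(src, dst):
    """Read src in one go, keep the human blocks, write them to dst."""
    with open(src) as f:
        data = f.read()
    with open(dst, "w") as out:
        # re-insert the '//' separator between the kept blocks
        out.write("\n//\n".join(human_blocks(data)) + "\n")


# Small in-memory sample taken from the question, used instead of a real file.
SAMPLE = """//
AC  T00024
OS  rat, Rattus norvegicus
BS  R00135; HS$APOA1_01; Quality: 6; APOA1, G000203; human, Homo sapiens.
//
AC  T00025
OS  human, Homo sapiens
BS  R02119; ANF$CONS_01; Quality: 4.
//
"""

for block in human_blocks(SAMPLE):
    print(block)
```

With a real file this would be called as `filter_file("file.dat", "human_only.dat")`, where both file names are placeholders.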

Community
Doron Cohen
  • I am new to programming. I have opened and read the file. I want to break the lines between the '//' markers into different parts. How should I do it? – Amy Jun 13 '16 at 07:22
  • Added the code. You should split the string with `'//'` as a delimiter. Then you can pick an item like this: `items[2]`, or print them all like this: `for item in items: print(item)` – Doron Cohen Jun 13 '16 at 07:34
  • I am now not able to go to the next line. Can you please solve the entire problem mentioned above?
        file = "file.dat"
        with open(file, 'r') as f:
            line = f.readline()
            items = line.split('//')
            for item in items:
                print(item)
    It obviously gets stuck at the first line itself and doesn't proceed to the next line. Also, I must retain those blocks which have OS - human. – Amy Jun 13 '16 at 07:50
  • Don't read one line, read the whole file. I am sorry, but I cannot solve the entire problem. Stack Overflow is not for asking other people to write code for you. You have a simple task: read the whole file with `data = f.read()`, then manipulate it with `data.split('//')` and the other string methods Python offers. Good luck! – Doron Cohen Jun 13 '16 at 08:58
  • Understood my mistake. I was trying to take a single line. – Amy Jun 13 '16 at 09:50
  • @Amy I edited the answer. If it satisfies you, please accept it. Thanks. – Doron Cohen Jun 13 '16 at 13:42
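The attempt in the comments processes only the first line because `f.readline()` is called once. If the file is too large to read comfortably with `f.read()`, a line-by-line variant can still work by collecting lines until the next '//' separator. The following is only a sketch: `iter_records` and `is_human_record` are hypothetical names, and it assumes each separator sits on a line of its own, as in the sample.

```python
def iter_records(lines):
    """Yield each record as a list of lines, splitting on lines that are exactly '//'."""
    current = []
    for line in lines:
        line = line.rstrip()  # drop the newline and trailing spaces
        if line == "//":
            if current:  # skip the empty stretch before the first separator
                yield current
            current = []
        elif line:  # ignore blank lines
            current.append(line)
    if current:  # last record when the file does not end with '//'
        yield current


def is_human_record(record):
    """True if the record has an OS line whose value is 'human, Homo sapiens'."""
    return any(line.split(None, 1) == ["OS", "human, Homo sapiens"] for line in record)


# In-memory stand-in for an open file; a file object can be passed the same way.
sample_lines = [
    "//\n",
    "AC  T00020\n",
    "OS  rat, Rattus norvegicus\n",
    "//\n",
    "AC  T00025\n",
    "OS  human, Homo sapiens\n",
    "BS  R02119; ANF$CONS_01; Quality: 4.\n",
    "//\n",
]

human = [rec for rec in iter_records(sample_lines) if is_human_record(rec)]
for rec in human:
    print("\n".join(rec))
```

With a real file, the list would be replaced by the open file object, for example `with open("file.dat") as f:` followed by a loop over `iter_records(f)`, where `file.dat` is the placeholder name from the comment above.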