I'm having some difficulty with my regex search, and I don't quite know why. I have a file with values formatted like this:
1 -1 2 SER HA H 4.477 0.003 1
2 -1 2 SER HB2 H 3.765 0.001 1
3 -1 2 SER HB3 H 3.765 0.001 1
4 -1 2 SER C C 173.726 0.2 1
5 -1 2 SER CA C 58.16 0.047 1
6 -1 2 SER CB C 64.056 0.046 1
7 0 3 HIS H H 8.357 0.004 1
8 0 3 HIS HA H 4.725 0.003 1
9 0 3 HIS HB2 H 3.203 0.003 2
.....
63 7 10 GLU HA H 4.328 0.004 1
64 7 10 GLU HB2 H 2.154 0.005 2
65 7 10 GLU HB3 H 2.156 0.004 2
66 7 10 GLU HG2 H 2.262 0.014 2
67 7 10 GLU HG3 H 2.464 0.001 2
68 7 10 GLU C C 177.242 0.2 1
69 7 10 GLU CA C 59.009 0.068 1
...
I want to search these lines one at a time, and only for the pattern described below.
import re

with open('delete.txt') as file:
    for lines in file:
        modifier = lines.strip()
        A = re.search(r'\B\d+\s[A-Z][A-Z][A-Z]\s[A-Z]', modifier)
        if A is not None:
            # A.string is the whole line that was searched, not just the matched part
            search = A.string
            print(search)
The formatting of these files changes a lot, but one thing is always consistent: there is a number, followed by 3 letters, followed by another letter, e.g. 2 SER HA
So I decided to use that as my regex search, but this isn't quite working. From the 63 7 10 GLU line onward it works perfectly, but it doesn't find any of the entries before that, even though every line appears to have the same format.
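To narrow it down, here is a smaller sketch that does not need delete.txt. It runs the same pattern on two lines copied from the file, one that is skipped and one that is found, and prints the matched substring rather than the whole line. The names PATTERN, SAMPLES and matched_text are just ones I made up for this snippet.

```python
import re

# Same pattern as in the script above, written as a raw string.
PATTERN = re.compile(r'\B\d+\s[A-Z][A-Z][A-Z]\s[A-Z]')

# Two lines copied from the file: one that is skipped, one that is found.
SAMPLES = [
    "2 -1 2 SER HB2 H 3.765 0.001 1",
    "63 7 10 GLU HA H 4.328 0.004 1",
]


def matched_text(line):
    """Return the substring the pattern matched, or None if there is no match."""
    m = PATTERN.search(line.strip())
    return m.group() if m is not None else None


for line in SAMPLES:
    print(repr(matched_text(line)), "<-", line)

# Output:
# None <- 2 -1 2 SER HB2 H 3.765 0.001 1
# '0 GLU H' <- 63 7 10 GLU HA H 4.328 0.004 1
```

Note that on the GLU line the match is 0 GLU H, not 10 GLU H as I expected, so only part of the number is being picked up.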
The delete.txt script above is an MVE.
Any help would be greatly appreciated!