I have a type of data file that contains the following block of text exactly once (!):
Begin final coordinates
new unit-cell volume = 460.57251 a.u.^3 ( 68.24980 Ang^3 )
density = 7.37364 g/cm^3
CELL_PARAMETERS (alat= 7.29434300)
0.995319813 0.000000000 0.000000000
0.000000000 0.995319813 0.000000000
0.000000000 0.000000000 1.197882354
ATOMIC_POSITIONS (crystal)
Pb 0.0000000000 0.0000000000 -0.0166356359
O 0.5000000000 0.5000000000 0.1549702780
Ti 0.5000000000 0.5000000000 0.5327649171
O 0.0000000000 0.5000000000 0.6381882204
O 0.5000000000 0.0000000000 0.6381882204
End final coordinates
I have found how to extract the entire block of lines between the Begin final coordinates and End final coordinates patterns, but I need it to be more refined. I would like to extract first the three lines below the line starting with CELL_PARAMETERS. Then I would like to extract (with another action, not in the same awk command) the 5 lines below ATOMIC_POSITIONS.
I have to make an observation here: I said at the beginning that the block of text appears only once, and this is true for that specific form with Begin final coordinates and End final coordinates. Throughout the data file there are many blocks with this form:
CELL_PARAMETERS (alat= 7.29434300)
0.995319813 0.000000000 0.000000000
0.000000000 0.995319813 0.000000000
0.000000000 0.000000000 1.197882354
ATOMIC_POSITIONS (crystal)
Pb 0.0000000000 0.0000000000 -0.0166356359
O 0.5000000000 0.5000000000 0.1549702780
Ti 0.5000000000 0.5000000000 0.5327649171
O 0.0000000000 0.5000000000 0.6381882204
O 0.5000000000 0.0000000000 0.6381882204
So unfortunately I cannot just use the CELL_PARAMETERS and ATOMIC_POSITIONS lines as patterns. The only lines appearing exactly once are Begin final coordinates and End final coordinates, so I have to extract text relative to these lines.
I have tried to combine the method for extracting lines between two patterns (from here) with the one for skipping N lines after finding a pattern (from here). Unfortunately I can't make it work.
So my idea was:
- for the first case: find the Begin final coordinates pattern, skip 5 lines (including the one with the pattern), print the 3 lines I am interested in, and then skip the rest until the End final coordinates line.
- for the second case: find Begin final coordinates, skip the lines up to and including ATOMIC_POSITIONS, then print the next 5 lines, stopping before the End final coordinates line.
Can this be done?
Update:
I have just tried this:
awk '/Begin final coordinates/ {n=NR+9} n < NR < n+3'
but I get a syntax error:
awk: cmd. line:1: /Begin final coordinates/ {n=NR+9} n<NR<n+3
awk: cmd. line:1: ^ syntax error
What am I doing wrong here?
Update2:
Hold the presses, I got it!
- this solves the first case:
awk '/Begin final coordinates/{n=NR+4;m=NR+8} (n<NR) && (NR<m)' file
- this solves the second case:
awk '/Begin final coordinates/{n=NR+9;m=NR+15} (n<NR) && (NR<m)' file
It's not very nice, but it will do the job!
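For anyone who would rather not count line offsets, here is a minimal sketch of a variant that keys on the header lines but only while inside the Begin/End final coordinates block. It assumes the headers start at the beginning of the line and that blank lines between sections can be skipped. The function name extract_after and the temporary sample file are made up for illustration, and the values in the intermediate block are dummy numbers, not real output.

```shell
# Hypothetical sample file: one intermediate block (dummy values),
# followed by the unique final block from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
CELL_PARAMETERS (alat= 7.29434300)
1.000000000 0.000000000 0.000000000
0.000000000 1.000000000 0.000000000
0.000000000 0.000000000 1.000000000

ATOMIC_POSITIONS (crystal)
Pb 0.0000000000 0.0000000000 0.0000000000
O 0.5000000000 0.5000000000 0.1000000000
Begin final coordinates
new unit-cell volume = 460.57251 a.u.^3 ( 68.24980 Ang^3 )
density = 7.37364 g/cm^3

CELL_PARAMETERS (alat= 7.29434300)
0.995319813 0.000000000 0.000000000
0.000000000 0.995319813 0.000000000
0.000000000 0.000000000 1.197882354

ATOMIC_POSITIONS (crystal)
Pb 0.0000000000 0.0000000000 -0.0166356359
O 0.5000000000 0.5000000000 0.1549702780
Ti 0.5000000000 0.5000000000 0.5327649171
O 0.0000000000 0.5000000000 0.6381882204
O 0.5000000000 0.0000000000 0.6381882204
End final coordinates
EOF

# extract_after HEADER COUNT FILE (hypothetical helper):
# print the COUNT non-empty lines that follow HEADER,
# but only inside the Begin/End final coordinates block.
extract_after() {
  awk -v header="$1" -v count="$2" '
    /Begin final coordinates/ { inblock = 1; next }      # enter the unique block
    /End final coordinates/   { inblock = 0 }            # leave it again
    inblock && index($0, header) == 1 { left = count; next }  # header found: arm the counter
    inblock && left > 0 && NF { print; left-- }          # print non-empty lines while armed
  ' "$3"
}

cell=$(extract_after CELL_PARAMETERS 3 "$sample")
atoms=$(extract_after ATOMIC_POSITIONS 5 "$sample")

printf '%s\n' "$cell"
printf '%s\n' "$atoms"
rm -f "$sample"
```

The two calls stay separate, as in the two commands above, so each result can be redirected to its own file. Because the counter is only armed inside the final block, the earlier CELL_PARAMETERS and ATOMIC_POSITIONS blocks are ignored.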