I have the following (simplified) file:
<RESULTS>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,3</COLUMN>
</ROW>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,1</COLUMN>
</ROW>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,2</COLUMN>
</ROW>
</RESULTS>
I want to delete every ROW element that matches a given TITLE but does not carry the latest VERSION (in this case 1,3). What I have in mind is something like the following sed command:
sed -i '/<ROW>/,/<\/ROW>/<COLUMN NAME=\"TITLE\">title 1.*<COLUMN NAME=\"VERSION\">^1,3<\/COLUMN>/d' file
The expected output is:
<RESULTS>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,3</COLUMN>
</ROW>
</RESULTS>
Unfortunately, this did not work, and neither did anything else I tried. I searched extensively for similar issues, but none of the suggestions worked for me. Is there a way to achieve this with a Linux command-line utility (sed, awk, etc.)?
Thanks a lot in advance.
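For illustration, here is a minimal awk sketch of one way the filtering described above might be done. It is not a general XML solution. It assumes that every tag sits on its own line exactly as in the sample, that the title and the version to keep are already known, and that neither contains backslashes (awk interprets escape sequences in -v values). The function name filter_rows and the variable names title, keep, and buf are hypothetical, chosen only for this example.

```shell
# Hypothetical helper: filter_rows TITLE VERSION FILE
# Prints FILE, dropping every <ROW>...</ROW> block whose TITLE equals TITLE
# but whose VERSION differs from VERSION. Rows with other titles are kept.
filter_rows() {
  awk -v title="$1" -v keep="$2" '
    # A new row starts: reset the buffer and start collecting lines.
    /<ROW>/ { buf = ""; inrow = 1 }

    inrow {
      buf = buf $0 ORS
      if ($0 ~ /<\/ROW>/) {
        inrow = 0
        t = "<COLUMN NAME=\"TITLE\">" title "</COLUMN>"
        v = "<COLUMN NAME=\"VERSION\">" keep "</COLUMN>"
        # Keep the row if the title differs or the version is the wanted one.
        if (index(buf, t) == 0 || index(buf, v) > 0) printf "%s", buf
      }
      next
    }

    # Lines outside any row (e.g. <RESULTS>) pass through unchanged.
    { print }
  ' "$3"
}
```

It could be used as `filter_rows 'title 1' '1,3' file > file.new && mv file.new file`, since awk does not edit in place the way sed -i does. index() compares literal strings rather than regular expressions, so characters such as the comma or a dot in the version need no escaping. For input that is not laid out one tag per line, a real XML-aware tool would likely be more robust than line-based matching.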