Imagine I have the following piece of text:

<Data>
    <Country>
       <Name>Portugal</Name>
       <Population>10M</Population>
       <Sub>
          <Code>Y</Code>
       </Sub>
    </Country>
    <Country>
       <Name>Spain</Name>
       <Population>30M</Population>
       <Sub>
          <Code>Y</Code>
       </Sub>
    </Country>
</Data>

How can I change the Code from Y to N for the Country Portugal without changing the Code of the other countries?

I've tried to use sed:

sed -i '/<Country>Portugal<\/Country>/{s/Y/N/;}' file.xml

but this is not replacing anything.

Can you tell me what I am doing wrong? How can I replace the first occurrence of Y after the line that matches Portugal?

Thanks!

4 Answers

2

Avoid parsing XML with regex. Use an XML processing tool like xmlstarlet:

$ cat foo.xml
<Data>
  <Country>
    <Name>Portugal</Name>
    <Population>10M</Population>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
  <Country>
    <Name>Spain</Name>
    <Population>30M</Population>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
</Data>

$ xmlstarlet edit --update '/Data/Country[Name="Portugal"]/Sub/Code' -v "N" foo.xml
<?xml version="1.0"?>
<Data>
  <Country>
    <Name>Portugal</Name>
    <Population>10M</Population>
    <Sub>
      <Code>N</Code>
    </Sub>
  </Country>
  <Country>
    <Name>Spain</Name>
    <Population>30M</Population>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
</Data>
that other guy
  • Thanks! I was trying regex because I do not have access to xmlstarlet and have no means to install it. However, I do have xmllint. Do you think it would also be possible with that? – CarlosBernardes Jul 10 '18 at 21:25
0

Use a range match.

sed '/<Name>Portugal</,/<\/Country>/ s/<Code>Y</<Code>N</' file.xml

(Edited to match updated requirements.)
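
As a minimal sketch of how the range address behaves, the snippet below builds a small sample file and runs the same command on it. The file name demo.xml and the variable name result are assumptions made for the demo, and the sample omits the Population lines for brevity.

```shell
# Hypothetical sample file for the demo (demo.xml is an assumed name).
cat > demo.xml <<'EOF'
<Data>
  <Country>
    <Name>Portugal</Name>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
  <Country>
    <Name>Spain</Name>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
</Data>
EOF

# The range starts at the line containing <Name>Portugal< and ends at the
# next line containing </Country>. The s command runs only on lines inside
# that range, so the Code under Spain is never touched.
result=$(sed '/<Name>Portugal</,/<\/Country>/ s/<Code>Y</<Code>N</' demo.xml)
printf '%s\n' "$result"
```

This relies on each tag sitting on its own line, as in the question; it is a line-oriented match, not real XML parsing.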

Paul Hodges
  • Thanks for your help! However, I made a mistake in the question. :( Your solution would be fine for what I wrote, but I forgot that <Country> is the parent and Portugal comes inside the <Name> tags. – CarlosBernardes Jul 10 '18 at 21:03
0

This might work for you (GNU sed):

sed '/<Country>/{:a;N;/<\/Country>/!ba;/Portugal/s/Y/N/}' file

Gather the lines of each Country block into the pattern space; if the gathered block contains Portugal, replace the first Y in it with N.
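
The same gathering loop can be written out one command per line, which may make it easier to follow. This is only a sketch: demo.xml and the variable gathered are assumed names, and the final test and substitution are tightened to <Name>Portugal< and <Code>Y< (a variation on the one-liner above) so that a stray Y elsewhere in the block is not replaced.

```shell
# Hypothetical sample file for the demo (demo.xml is an assumed name).
cat > demo.xml <<'EOF'
<Data>
  <Country>
    <Name>Portugal</Name>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
  <Country>
    <Name>Spain</Name>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
</Data>
EOF

gathered=$(sed '
# Start collecting when a Country block opens.
/<Country>/ {
    :a
    # Append the next input line to the pattern space.
    N
    # Keep looping until the closing tag has been appended.
    /<\/Country>/!ba
    # The whole block is now in the pattern space: edit it only for Portugal.
    /<Name>Portugal</ s/<Code>Y</<Code>N</
}' demo.xml)
printf '%s\n' "$gathered"
```

Because the whole block is held in the pattern space before the test runs, this form would still work if the Name element came after the Code element.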

potong
0

If your input is really always exactly that format then all you need is:

$ awk '/<Name>/{f=/Portugal/} f && /<Code>/{sub(/Y/,"N")} 1' file
<Data>
    <Country>
       <Name>Portugal</Name>
       <Population>10M</Population>
       <Sub>
          <Code>N</Code>
       </Sub>
    </Country>
    <Country>
       <Name>Spain</Name>
       <Population>30M</Population>
       <Sub>
          <Code>Y</Code>
       </Sub>
    </Country>
</Data>
Ed Morton