I have an XML file that contains many blocks like this:

<BankAccount code="NL18INGB0001234567">
<BankAccountType code="NL">
<Description/>

I need to replace code="NL" with code="IBA", but only when the BankAccount code contains INGB000. I use the following sed command:

sed 'N;s/\(INGB000[0-9].*NL\)/\1_OUD/g;s/NL_OUD/IBA/g' file1.xml > file2.xml

The problem is that this command replaces only the first occurrence and leaves all the others unchanged.

I expected the g flag to make the substitution global, but that didn't work.

I also tried:

sed ':a;N;ta;s/\(INGB000[0-9].*NL\)/\1_OUD/g;s/NL_OUD/IBA/g' file1.xml > file2.xml

What am I doing wrong?

Input:

<?xml version='1.0' encoding='UTF-8' ?>
<eExact xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="eExact-Schema.xsd">
    <Accounts>
        <Account code="1206" status="A" type="C">
            <Name>John Doe</Name>
            <Contacts>
                <Contact default="1" gender="M" status="A">
                    <LastName>Client: 10000</LastName>
                    <Initials/>
                    <Addresses>
                        <Address type="V">
                            <AddressLine1>one-way-street</AddressLine1>
                            <PostalCode>1000 AB</PostalCode>
                            <City>Simcity 1</City>
                            <Country code="NL"/>
                        </Address>
                    </Addresses>
                </Contact>
            </Contacts>
            <Debtor code="1206" number="1206">
                <BankAccounts>
                    <BankAccount code="NL93INGB0001234567">
                        <BankAccountType code="NL">
                            <Description/>
                        </BankAccountType>
                        <Bank code="">
                            <Name/>
                            <IBAN>NL93INGB0001234567</IBAN>
                        </Bank>
                        <SDDMandate>
                            <MndtId>02001234-0000004</MndtId>
                            <DtOfSgntr>2000-11-01</DtOfSgntr>
                            <LclInstrm>Core</LclInstrm>
                            <LastSDDDt/>
                        </SDDMandate>
                    </BankAccount>
                </BankAccounts>
                <SendReminder>1</SendReminder>
            </Debtor>
        </Account>

        <Account code="1123" status="A" type="C">
            <Name>Johny Doe</Name>
            <Contacts>
                <Contact default="1" gender="V" status="A">
                    <LastName>Client: 10001</LastName>
                    <Addresses>
                        <Address type="V">
                            <AddressLine1>one-way-street</AddressLine1>
                            <PostalCode>1000 AB</PostalCode>
                            <City>Simcity 2</City>
                            <Country code="NL"/>
                        </Address>
                    </Addresses>
                </Contact>
            </Contacts>
            <Debtor code="1123" number="1123">
                <BankAccounts>
                    <BankAccount code="NL25RABO0123456789">
                        <BankAccountType code="NL">
                            <Description/>
                        </BankAccountType>
                        <Bank code="">
                            <Name/>
                            <IBAN>NL25RABO0123456789</IBAN>
                        </Bank>
                        <SDDMandate>
                            <MndtId>02001234-0000003</MndtId>
                            <DtOfSgntr>2000-02-03</DtOfSgntr>
                            <LclInstrm>Core</LclInstrm>
                            <LastSDDDt/>
                        </SDDMandate>
                    </BankAccount>
                </BankAccounts>
                <SendReminder>1</SendReminder>
            </Debtor>
        </Account>
        <Account code="1109" status="A" type="C">
            <Name>Joan Doe</Name>
            <Contacts>
                <Contact default="1" gender="V" status="A">
                    <LastName>Client: 10002</LastName>
                    <Initials/>
                    <Addresses>
                        <Address type="V">
                            <AddressLine1>one-way-street</AddressLine1>
                            <PostalCode>1000 AB</PostalCode>
                            <City>Simcity 1</City>
                            <Country code="NL"/>
                        </Address>
                    </Addresses>
                </Contact>
            </Contacts>
            <Debtor code="1109" number="1109">
                <BankAccounts>
                    <BankAccount code="NL46RABO0123456789">
                        <BankAccountType code="NL">
                            <Description/>
                        </BankAccountType>
                        <Bank code="">
                            <Name/>
                            <IBAN>NL46RABO0123456789</IBAN>
                        </Bank>
                        <SDDMandate>
                            <MndtId>02001234-0000002</MndtId>
                            <DtOfSgntr>2000-11-01</DtOfSgntr>
                            <LclInstrm>Core</LclInstrm>
                            <LastSDDDt/>
                        </SDDMandate>
                    </BankAccount>
                </BankAccounts>
                <SendReminder>1</SendReminder>
            </Debtor>
        </Account>
        <Account code="1631" status="A" type="C">
            <Name>Flint</Name>
            <Contacts>
                <Contact default="1" gender="V" status="A">
                    <LastName>Client: 10003</LastName>
                    <Initials/>
                    <Addresses>
                        <Address type="V">
                            <AddressLine1>one-way-street</AddressLine1>
                            <PostalCode>1000 AB</PostalCode>
                            <City>Simcity 3</City>
                            <Country code="NL"/>
                        </Address>
                    </Addresses>
                </Contact>
            </Contacts>
            <Debtor code="1631" number="1631">
                <BankAccounts>
                    <BankAccount code="NL10INGB0001234567">
                        <BankAccountType code="NL">
                            <Description/>
                        </BankAccountType>
                        <Bank code="">
                            <Name/>
                            <IBAN>NL10INGB0001234567</IBAN>
                        </Bank>
                        <SDDMandate>
                            <MndtId>02001234-0000001</MndtId>
                            <DtOfSgntr>2000-07-05</DtOfSgntr>
                            <LclInstrm>Core</LclInstrm>
                            <LastSDDDt/>
                        </SDDMandate>
                    </BankAccount>
                </BankAccounts>
                <SendReminder>1</SendReminder>
            </Debtor>
        </Account>
    </Accounts>
</eExact>

Desired output:

<?xml version="1.0" encoding="UTF-8"?>
<eExact xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="eExact-Schema.xsd">
  <Accounts>
    <Account code="1206" status="A" type="C">
      <Name>John Doe</Name>
      <Contacts>
        <Contact default="1" gender="M" status="A">
          <LastName>Client: 10000</LastName>
          <Initials/>
          <Addresses>
            <Address type="V">
              <AddressLine1>one-way-street</AddressLine1>
              <PostalCode>1000 AB</PostalCode>
              <City>Simcity 1</City>
              <Country code="NL"/>
            </Address>
          </Addresses>
        </Contact>
      </Contacts>
      <Debtor code="1206" number="1206">
        <BankAccounts>
          <BankAccount code="NL93INGB0001234567">
            <BankAccountType code="IBA">
              <Description/>
            </BankAccountType>
            <Bank code="">
              <Name/>
              <IBAN>NL93INGB0001234567</IBAN>
            </Bank>
            <SDDMandate>
              <MndtId>02001234-0000004</MndtId>
              <DtOfSgntr>2000-11-01</DtOfSgntr>
              <LclInstrm>Core</LclInstrm>
              <LastSDDDt/>
            </SDDMandate>
          </BankAccount>
        </BankAccounts>
        <SendReminder>1</SendReminder>
      </Debtor>
    </Account>
    <Account code="1123" status="A" type="C">
      <Name>Johny Doe</Name>
      <Contacts>
        <Contact default="1" gender="V" status="A">
          <LastName>Client: 10001</LastName>
          <Addresses>
            <Address type="V">
              <AddressLine1>one-way-street</AddressLine1>
              <PostalCode>1000 AB</PostalCode>
              <City>Simcity 2</City>
              <Country code="NL"/>
            </Address>
          </Addresses>
        </Contact>
      </Contacts>
      <Debtor code="1123" number="1123">
        <BankAccounts>
          <BankAccount code="NL25RABO0123456789">
            <BankAccountType code="NL">
              <Description/>
            </BankAccountType>
            <Bank code="">
              <Name/>
              <IBAN>NL25RABO0123456789</IBAN>
            </Bank>
            <SDDMandate>
              <MndtId>02001234-0000003</MndtId>
              <DtOfSgntr>2000-02-03</DtOfSgntr>
              <LclInstrm>Core</LclInstrm>
              <LastSDDDt/>
            </SDDMandate>
          </BankAccount>
        </BankAccounts>
        <SendReminder>1</SendReminder>
      </Debtor>
    </Account>
    <Account code="1109" status="A" type="C">
      <Name>Joan Doe</Name>
      <Contacts>
        <Contact default="1" gender="V" status="A">
          <LastName>Client: 10002</LastName>
          <Initials/>
          <Addresses>
            <Address type="V">
              <AddressLine1>one-way-street</AddressLine1>
              <PostalCode>1000 AB</PostalCode>
              <City>Simcity 1</City>
              <Country code="NL"/>
            </Address>
          </Addresses>
        </Contact>
      </Contacts>
      <Debtor code="1109" number="1109">
        <BankAccounts>
          <BankAccount code="NL46RABO0123456789">
            <BankAccountType code="NL">
              <Description/>
            </BankAccountType>
            <Bank code="">
              <Name/>
              <IBAN>NL46RABO0123456789</IBAN>
            </Bank>
            <SDDMandate>
              <MndtId>02001234-0000002</MndtId>
              <DtOfSgntr>2000-11-01</DtOfSgntr>
              <LclInstrm>Core</LclInstrm>
              <LastSDDDt/>
            </SDDMandate>
          </BankAccount>
        </BankAccounts>
        <SendReminder>1</SendReminder>
      </Debtor>
    </Account>
    <Account code="1631" status="A" type="C">
      <Name>Flint</Name>
      <Contacts>
        <Contact default="1" gender="V" status="A">
          <LastName>Client: 10003</LastName>
          <Initials/>
          <Addresses>
            <Address type="V">
              <AddressLine1>one-way-street</AddressLine1>
              <PostalCode>1000 AB</PostalCode>
              <City>Simcity 3</City>
              <Country code="NL"/>
            </Address>
          </Addresses>
        </Contact>
      </Contacts>
      <Debtor code="1631" number="1631">
        <BankAccounts>
          <BankAccount code="NL10INGB0001234567">
            <BankAccountType code="IBA">
              <Description/>
            </BankAccountType>
            <Bank code="">
              <Name/>
              <IBAN>NL10INGB0001234567</IBAN>
            </Bank>
            <SDDMandate>
              <MndtId>02001234-0000001</MndtId>
              <DtOfSgntr>2000-07-05</DtOfSgntr>
              <LclInstrm>Core</LclInstrm>
              <LastSDDDt/>
            </SDDMandate>
          </BankAccount>
        </BankAccounts>
        <SendReminder>1</SendReminder>
      </Debtor>
    </Account>
  </Accounts>
</eExact>

This doesn't work:

sed '/BankAccount.*INGB000/,$ s/BankAccountType code="NL"/BankAccountType code="IBA"/g' file1.xml > file2.xml

It replaces every BankAccountType code="NL" that comes after the first INGB000 match, not only the ones that belong to an INGB000 account.

Apojoost

2 Answers

sed '/BankAccount.*INGB000/,$ s/code="NL"/code="IBA"/g' file1.xml > file2.xml

Or, if you meant that you want to change code="NL... to code="IBA..., omit the closing quotes:

sed '/BankAccount.*INGB000/,$ s/code="NL/code="IBA/g' file1.xml > file2.xml

EDIT:

I'm still guessing at the output you want, but try this:

sed '/BankAccount code=".*INGB000/{N;s/code="NL"/code="IBA"/;}' file1.xml > file2.xml
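
A small experiment may help show why the original N;s/.../g command missed records while the command above does not. The original joins lines in fixed pairs (1+2, 3+4, ...), so it can only match when a BankAccount line happens to open a pair, and the g flag only repeats the substitution inside that two-line pattern space. The command above instead runs N on the matching BankAccount line itself. The sample data below is made up for illustration. The sketch assumes GNU sed and that BankAccountType always sits on the line directly after BankAccount.

```shell
#!/bin/sh
# Hypothetical sample (18 lines): INGB records start on line 3 (odd) and
# line 8 (even); a RABO record and a Country line must stay untouched.
sample='<Country code="NL"/>
<BankAccounts>
<BankAccount code="NL93INGB0001234567">
<BankAccountType code="NL">
<Description/>
</BankAccountType>
</BankAccount>
<BankAccount code="NL10INGB0001234567">
<BankAccountType code="NL">
<Description/>
</BankAccountType>
</BankAccount>
<BankAccount code="NL25RABO0123456789">
<BankAccountType code="NL">
<Description/>
</BankAccountType>
</BankAccount>
</BankAccounts>'

# Original attempt: an unconditional N pairs lines 1+2, 3+4, 5+6, ...
# Only a BankAccount line that opens a pair shares the pattern space
# with its BankAccountType line.
original=$(printf '%s\n' "$sample" |
    sed 'N;s/\(INGB000[0-9].*NL\)/\1_OUD/g;s/NL_OUD/IBA/g')

# Command from this answer: N runs only on lines matching the address,
# so the BankAccountType line is always pulled in for INGB000 accounts.
fixed=$(printf '%s\n' "$sample" |
    sed '/BankAccount code=".*INGB000/{N;s/code="NL"/code="IBA"/;}')

echo "original command, lines with IBA: $(printf '%s\n' "$original" | grep -c 'code="IBA"')"
echo "fixed command, lines with IBA:    $(printf '%s\n' "$fixed" | grep -c 'code="IBA"')"
```

With GNU sed this should report 1 for the original command and 2 for the fixed one, while the Country line and the RABO account keep code="NL". If the two elements were ever separated by another line, or written on a single line, the line-based approach would need adjusting.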
Beta
  • Not good. The XML has another branch, and your code also substitutes the "NL" in that branch, which I don't want. – Apojoost Sep 15 '16 at 09:36
  • @Apojoost: When you said _"all the other ones"_, I assumed you meant all the other ones. Perhaps you could edit your question to add a better example of the input, along with the output you want. – Beta Sep 15 '16 at 23:36
  • Sorry, this worked once I made it more specific: sed '/BankAccount.*INGB000/,$ s/BankAccountType code="NL"/BankAccountType code="IBA"/g' DEBITR_2016.08\ file1 > file2 – Apojoost Sep 17 '16 at 14:52
  • I thought it worked, but I didn't check the result properly. Your answer replaces every code=NL after the first INGB000. Can you help me a little further? – Apojoost Oct 19 '16 at 19:37
  • could you please help me again? – Apojoost Oct 24 '16 at 21:57
  • @Apojoost: Your question contains an example of input. **Could you edit the question to add the corresponding desired output?** – Beta Oct 25 '16 at 02:14
  • Sorry for the delay. I updated my question with the input file and the desired output. Hope it helps. – Apojoost Nov 06 '16 at 21:09
  • @Apojoost: **The sed command in my edit of Oct 19 turns your input into your desired output** (apart from the whitespace differences you left in, and the trivial differences in the first line). I give up. – Beta Nov 07 '16 at 15:26
  • Sorry. You're right. I didn't see the post from 19 Oct. It works. – Apojoost Nov 07 '16 at 20:57

What am I doing wrong?

In my opinion you're using the wrong tool for the job. In most cases, you shouldn't try to parse XML with regex.

For this minor change you could use something like xmlstarlet...

XML Input

<doc>
    <BankAccount code="NL18INGB000ABCDEFG">
        <BankAccountType code="NL">
            <Description/>
        </BankAccountType>
    </BankAccount>
    <BankAccount code="NL18INGBXXX1234567">
        <BankAccountType code="NL">
            <Description/>
        </BankAccountType>
    </BankAccount>
    <BankAccount code="NL18INGB0001234567">
        <BankAccountType code="NL">
            <Description/>
        </BankAccountType>
    </BankAccount>
</doc>

xmlstarlet command

xml.exe ed -u "//BankAccount[contains(@code,'INGB000')]/BankAccountType/@code" -v "IBA" input.xml 

XML Output

<doc>
  <BankAccount code="NL18INGB000ABCDEFG">
    <BankAccountType code="IBA">
      <Description/>
    </BankAccountType>
  </BankAccount>
  <BankAccount code="NL18INGBXXX1234567">
    <BankAccountType code="NL">
      <Description/>
    </BankAccountType>
  </BankAccount>
  <BankAccount code="NL18INGB0001234567">
    <BankAccountType code="IBA">
      <Description/>
    </BankAccountType>
  </BankAccount>
</doc>
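
As a rough sketch of running the same edit from a shell script: the example below writes the sample input to a temporary file, applies the XPath update, and captures the result. It assumes the binary is installed as xmlstarlet (common on Linux packages; other builds name it xml or xml.exe), and it simply skips the edit when the tool is missing. The variable names are only for illustration.

```shell
#!/bin/sh
# Write the sample input from this answer to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
<doc>
    <BankAccount code="NL18INGB000ABCDEFG">
        <BankAccountType code="NL">
            <Description/>
        </BankAccountType>
    </BankAccount>
    <BankAccount code="NL18INGBXXX1234567">
        <BankAccountType code="NL">
            <Description/>
        </BankAccountType>
    </BankAccount>
    <BankAccount code="NL18INGB0001234567">
        <BankAccountType code="NL">
            <Description/>
        </BankAccountType>
    </BankAccount>
</doc>
EOF

result=""
if command -v xmlstarlet >/dev/null 2>&1; then
    # ed -u selects nodes by XPath, -v supplies the new value.
    # Only BankAccountType/@code under a matching BankAccount is updated.
    result=$(xmlstarlet ed \
        -u "//BankAccount[contains(@code,'INGB000')]/BankAccountType/@code" \
        -v "IBA" "$input")
    printf '%s\n' "$result"
else
    echo "xmlstarlet not found; skipping the edit" >&2
fi

rm -f "$input"
```

Because the selection is done on the parsed document, line breaks and indentation in the input do not matter here, unlike with the sed approach.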

You could also use XSLT. This is what I prefer.

XSLT 1.0 (can be processed with xmlstarlet or another processor; I used Saxon-HE, run with Java from the command line, for testing):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="BankAccount[contains(@code,'INGB000')]/BankAccountType/@code">
    <xsl:attribute name="code">IBA</xsl:attribute>
  </xsl:template>

</xsl:stylesheet>

With the input from the xmlstarlet example, the output is the same.
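
One possible way to apply such a stylesheet from the command line, sketched under some assumptions: the example uses xmlstarlet tr if it is installed, falls back to xsltproc, and skips the transform if neither is found. The two-record input and the variable names are made up for illustration. This is not the Saxon-HE setup mentioned above.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Same stylesheet as above: identity transform plus one attribute override.
cat > "$workdir/fix.xsl" <<'EOF'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="BankAccount[contains(@code,'INGB000')]/BankAccountType/@code">
    <xsl:attribute name="code">IBA</xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
EOF

# Hypothetical input: one INGB000 account and one RABO account.
cat > "$workdir/input.xml" <<'EOF'
<doc>
  <BankAccount code="NL93INGB0001234567">
    <BankAccountType code="NL"><Description/></BankAccountType>
  </BankAccount>
  <BankAccount code="NL25RABO0123456789">
    <BankAccountType code="NL"><Description/></BankAccountType>
  </BankAccount>
</doc>
EOF

result=""
if command -v xmlstarlet >/dev/null 2>&1; then
    result=$(xmlstarlet tr "$workdir/fix.xsl" "$workdir/input.xml")
elif command -v xsltproc >/dev/null 2>&1; then
    result=$(xsltproc "$workdir/fix.xsl" "$workdir/input.xml")
else
    echo "no XSLT 1.0 processor found; skipping the transform" >&2
fi

printf '%s\n' "$result"
rm -rf "$workdir"
```

Only the BankAccountType under the INGB000 account should end up with code="IBA"; everything else is copied through by the identity template.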

Daniel Haley