I have some XML that looks like:
<records>
<Customer>
<Reference>123</Reference>
<Name>John Smith</Name>
<Address1>1, The street</Address1>
<Address2>Upper Town Street</Address2>
<Address3>Anytown</Address3>
<Address4>County</Address4>
<PostCode>POS TCD</PostCode>
</Customer>
</records>
but for which Address2 is optional, so this is also valid:
<records>
<Customer>
<Reference>123</Reference>
<Name>John Smith</Name>
<Address1>1, The street</Address1>
<Address3>Anytown</Address3>
<Address4>County</Address4>
<PostCode>POS TCD</PostCode>
</Customer>
</records>
(Note: this is a cut-down XML snippet.)
I have the following regex that matches correctly when Address2 is specified:
<Reference>(?<Reference>.*)</Reference>[\w|\W]*<Name>(?<Name>.*)</Name>[\w|\W]*<Address1>(?<Address1>.*)</Address1>[\w|\W]*<Address2>(?<Address2>.*)</Address2>
It doesn't match when Address2 isn't specified. The closest I've got is the following:
<Reference>(?<Reference>.*)</Reference>[\w|\W]*<Name>(?<Name>.*)</Name>[\w|\W]*<Address1>(?<Address1>.*)</Address1>[\w|\W]*(<Address2>(?<Address2>.*)</Address2>)?
which matches and populates Reference, Name and Address1 for both XML snippets, but leaves Address2 blank in both cases, rather than capturing Upper Town Street as Address2 for the first snippet.
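For anyone who wants to reproduce this, here is a minimal sketch in Python. It is an approximation rather than the original setup: Python's `re` module writes named groups as `(?P<Name>...)` instead of `(?<Name>...)`, and it reports a group that did not take part in the match as `None` rather than as an empty string. The names `PATTERN` and `extract` are made up for this illustration. The result is consistent with the last greedy `[\w|\W]*` consuming the rest of the input, so the optional group is only ever tried at the very end, where it matches nothing.

```python
import re

# The first snippet from above, with Address2 present.
WITH_ADDRESS2 = """<records>
<Customer>
<Reference>123</Reference>
<Name>John Smith</Name>
<Address1>1, The street</Address1>
<Address2>Upper Town Street</Address2>
<Address3>Anytown</Address3>
<Address4>County</Address4>
<PostCode>POS TCD</PostCode>
</Customer>
</records>"""

# The second snippet: the same record with the Address2 line removed.
WITHOUT_ADDRESS2 = WITH_ADDRESS2.replace(
    "<Address2>Upper Town Street</Address2>\n", ""
)

# The second regex from the question, unchanged apart from the
# Python-specific (?P<name>...) named-group syntax.
PATTERN = re.compile(
    r"<Reference>(?P<Reference>.*)</Reference>[\w|\W]*"
    r"<Name>(?P<Name>.*)</Name>[\w|\W]*"
    r"<Address1>(?P<Address1>.*)</Address1>[\w|\W]*"
    r"(<Address2>(?P<Address2>.*)</Address2>)?"
)


def extract(xml):
    """Return the named groups of the first match as a dict, or None."""
    match = PATTERN.search(xml)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(extract(WITH_ADDRESS2))
    print(extract(WITHOUT_ADDRESS2))
```

In both cases Reference, Name and Address1 are populated, while Address2 comes back as `None` (the Python counterpart of the blank group described above).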
Aside: I know that using an XML parser would probably be easier, but the XML isn't clean and this was supposed to be a quick and easy solution(!). I also know that I could break this down into a set of separate regexes, but it has now become a bit of an intellectual challenge, and I'd love to have a solution to it.