
I am trying to write a regex pattern that matches markup of this form:

<a attributes="some set of attributes"><img attributes="some set of attributes"/></a>

Rules:

    An <a> tag with attributes that contains only an <img> tag with attributes, and no text data.

Sample Valid Data:

        <a xlink:href="some link" title="Image" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://www.w3.org/1999/xhtml">
            <img alt="No Image" title="No Image" xlink:href="some path for image" xlink:title="Image" xmlns="http://www.w3.org/1999/xhtml" xmlns:xlink="http://www.w3.org/1999/xlink" />
        </a>

Invalid:

    <a>data<img/></a>       -- text data present, no attributes
    <a><img>abcd</img></a>  -- text data present, no attributes
    <a><img/></a>           -- no attributes

Can anyone suggest how to write a pattern for this?

Thank you.

Patan

1 Answer

You can do this in a completely bulletproof manner with XPath:

//*[local-name()='a' and count(@*)>0 and *[local-name()='img' and count(@*)>0] and count(.//*)=1 and normalize-space(.)='']

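This selects every element with a local name of 'a' that has at least one attribute, no text content other than whitespace, and exactly one descendant element: a child with a local name of 'img' that itself has at least one attribute. Note that namespace declarations (xmlns, xmlns:xlink) are not attributes in XPath, so they are not counted by count(@*).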
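
As a rough cross-check of what that expression tests, here is the same predicate written out by hand in Python. This is only a sketch: it uses the standard library's xml.etree.ElementTree, whose XPath subset does not include local-name() or count(), so the conditions are checked in plain code instead. The helper names local_name and is_valid_anchor are made up for this illustration. With a full XPath 1.0 engine you could evaluate the expression directly.

```python
import xml.etree.ElementTree as ET


def local_name(elem):
    # ElementTree stores namespaced tags as "{uri}local"; drop the "{uri}" part.
    return elem.tag.rsplit("}", 1)[-1]


def is_valid_anchor(a):
    """Hand-written equivalent of the XPath predicate (hypothetical helper)."""
    # local-name()='a' and count(@*)>0
    # ElementTree, like XPath, does not treat xmlns declarations as attributes.
    if local_name(a) != "a" or not a.attrib:
        return False
    # count(.//*)=1 : iter() yields the element itself first, so skip it.
    descendants = list(a.iter())[1:]
    if len(descendants) != 1:
        return False
    # *[local-name()='img' and count(@*)>0]
    img = descendants[0]
    if local_name(img) != "img" or not img.attrib:
        return False
    # normalize-space(.)='' : only XML whitespace allowed as text.
    return "".join(a.itertext()).strip(" \t\r\n") == ""


valid = ET.fromstring(
    '<a xlink:href="some link" title="Image" '
    'xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xmlns="http://www.w3.org/1999/xhtml">\n'
    '  <img alt="No Image" xlink:href="some path for image" />\n'
    '</a>'
)
print(is_valid_anchor(valid))
print(is_valid_anchor(ET.fromstring("<a>data<img/></a>")))
```

The first call uses a trimmed version of the valid sample from the question and the second uses one of the invalid samples.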

However, since your example code is clearly XML with namespaces and all, perhaps you can reformulate your question to say what your overall task is instead of "what regex should I use". At the very least it seems that perhaps you should be paying attention to those namespaces instead of treating namespace declarations as attributes.

For example, maybe what you really mean is this?

//xhtml:a[@xlink:href and xhtml:img[@xlink:href] and count(.//*)=1 and normalize-space(.)='']
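
If the namespace-aware reading is the one you want, a sketch of that check in Python could look like the following. It assumes the xhtml and xlink prefixes are bound to the two namespace URIs from your sample, and it again uses the standard library's xml.etree.ElementTree with hand-written conditions rather than evaluating the XPath itself. The function name linked_images is invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the sample data in the question.
XHTML = "http://www.w3.org/1999/xhtml"
XLINK = "http://www.w3.org/1999/xlink"
HREF = f"{{{XLINK}}}href"  # ElementTree's "{uri}name" form of xlink:href


def linked_images(root):
    """Return xhtml:a elements that satisfy the namespace-aware rule."""
    matches = []
    for a in root.iter(f"{{{XHTML}}}a"):
        descendants = list(a.iter())[1:]  # skip the element itself
        # @xlink:href on the anchor, and count(.//*)=1
        if HREF not in a.attrib or len(descendants) != 1:
            continue
        # xhtml:img[@xlink:href]
        img = descendants[0]
        if img.tag != f"{{{XHTML}}}img" or HREF not in img.attrib:
            continue
        # normalize-space(.)=''
        if "".join(a.itertext()).strip(" \t\r\n") == "":
            matches.append(a)
    return matches


doc = ET.fromstring(
    '<div xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<a xlink:href="some link"><img xlink:href="some path"/></a>'
    '<a href="some link"><img src="some path"/></a>'
    '</div>'
)
print(len(linked_images(doc)))
```

Only the first anchor in this made-up document matches, because the second one carries plain href and src attributes rather than xlink:href.
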
Francis Avila