
I'm working to automate the testing of an API that takes and returns XML, so I want to translate the documented return data of the API into a schema as much as possible. I chose RelaxNG for this task because it is easy to learn and use.

Before I throw in all the info, here's the question:

Is it possible to describe an "unordered set of elements with the same name but different attributes"?

Here is a sample object for what I'm having trouble describing:

<item>
    <id>d395136e-d060-4a6e-887c-c0337dd7ad09</id>
    <name>The item has a name</name>
    <link rel="self" type="type1" href="url" />
    <link rel="download" type="type2" href="url" />
    <link rel="relatedData" type="type3" href="url" />
</item>

The link objects are the bit that I'm getting hung up on. Here is the problem:

  • The order of elements inside <item> is not guaranteed, so I am trying to put all elements in an <interleave> structure.
  • There will be multiple <link> elements inside <item>, with different sets of attributes (i.e., <item> MUST have a 'self' link, a 'download' link, and a 'relatedData' link to be valid).
  • One of each link type is required, but again order is not guaranteed.
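To make the requirement concrete, here is a reordering of the sample above that should be just as valid as the original. The values are copied from the sample and only the order of the children differs; the particular order is an assumption for illustration, not something taken from the API documentation:

```xml
<item>
    <link rel="relatedData" type="type3" href="url" />
    <name>The item has a name</name>
    <link rel="self" type="type1" href="url" />
    <id>d395136e-d060-4a6e-887c-c0337dd7ad09</id>
    <link rel="download" type="type2" href="url" />
</item>
```

A schema that fixes the position of the <link> elements would reject this document even though it carries the same data.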

I tried to describe the schema like so:

<element name="item">
    <interleave>
        <element name="id"><text/></element>
        <element name="name"><text/></element>
        <ref name="selfLink"/>
        <ref name="launchLink"/>
        <ref name="thumbnailLink"/>
    </interleave>
</element>

The 'link' references are defined elsewhere like so:

 <define name="selfLink">
 <element name="link">
     <attribute name="href"><text/></attribute>
     <attribute name="rel"><value>self</value></attribute>
     <attribute name="type"><value>type1</value></attribute>
 </element>
 </define>

The parser is not pleased about this. From jing I get the error: the element "link" can occur in more than one operand of "interleave". I can see what it's getting at, but I had hoped it could treat elements with the same name but different attributes as distinct items.

Moving the link refs out of interleave gets it to parse, but I'll be waiting for the validator to blow up whenever the order changes in the returned data.

Any ideas, or is this impossible? Is there an inherent issue with the XML I am processing that will require me to move some of this up to higher-level processing logic in my test application (manually checking for each link type after running a more generic XML validation)?

Pshemo
James
  • When you say "different attributes", do you actually mean "different attribute values"? – mzjn Aug 03 '12 at 19:41
  • Actually, yes, thank you for the clarification. Same set of attributes with different value requirements. – James Aug 06 '12 at 15:13

2 Answers


It looks like you have stumbled upon a restriction on interleave in RELAX NG. I would try to do this in Schematron, or perhaps a combination of RELAX NG and Schematron.

Here is a snippet that checks your <link> elements using the version of Schematron that is supported by Jing:

<schema xmlns="http://www.ascc.net/xml/schematron">
  <pattern name="link pattern">
    <rule context="item">
      <assert test='count(link) = 3'>There must be 3 link elements.</assert>
      <assert test="count(link[@rel = 'self' and @type ='type1']) = 1">There must be 1 link element where @rel='self' and @type='type1'.</assert>
      <assert test="count(link[@rel = 'download' and @type ='type2']) = 1">There must be 1 link element where @rel='download' and @type='type2'.</assert>
      <assert test="count(link[@rel = 'relatedData' and @type = 'type3']) = 1">There must be 1 link element where @rel='relatedData' and @type='type3'.</assert>
    </rule>
  </pattern>
</schema>
mzjn
  • This gets the idea across, and I was able to implement it for this example and confirm it works. Having just dipped my toes into embedding Schematron into RelaxNG, I think this might be the way to go for now, until/unless I find a roadblock. I gain the ability to keep the XML validation in one place (not split between a validator and application logic) at the expense of needing to set up a more complex processing pipeline. Seems a reasonable tradeoff. – James Aug 08 '12 at 22:05
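As a rough illustration of the embedding mentioned in the comment above, the Schematron rules can be placed inside the RELAX NG grammar as elements from a foreign namespace. This is a minimal sketch, not a tested setup: the `sch` prefix and the loose `link` pattern are assumptions, and because RELAX NG validation skips foreign-namespace elements, the rules would still have to be extracted and run by a Schematron processor in a separate step (that tooling is not shown here).

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:sch="http://www.ascc.net/xml/schematron">
  <start>
    <element name="item">
      <!-- Foreign-namespace annotation: not checked by RELAX NG itself;
           it has to be extracted and run as a Schematron schema. -->
      <sch:pattern name="link pattern">
        <sch:rule context="item">
          <sch:assert test="count(link) = 3">There must be 3 link elements.</sch:assert>
          <sch:assert test="count(link[@rel = 'self' and @type = 'type1']) = 1">There must be 1 link element where @rel='self' and @type='type1'.</sch:assert>
          <sch:assert test="count(link[@rel = 'download' and @type = 'type2']) = 1">There must be 1 link element where @rel='download' and @type='type2'.</sch:assert>
          <sch:assert test="count(link[@rel = 'relatedData' and @type = 'type3']) = 1">There must be 1 link element where @rel='relatedData' and @type='type3'.</sch:assert>
        </sch:rule>
      </sch:pattern>
      <!-- RELAX NG part: structure only, children in any order -->
      <interleave>
        <element name="id"><text/></element>
        <element name="name"><text/></element>
        <oneOrMore>
          <element name="link">
            <attribute name="href"/>
            <attribute name="rel"/>
            <attribute name="type"/>
          </element>
        </oneOrMore>
      </interleave>
    </element>
  </start>
</grammar>
```

The RELAX NG part only checks that the children are well-shaped and may appear in any order; the embedded assertions carry the "exactly one of each link type" requirement.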

See if the following schema helps:

<grammar xmlns="http://relaxng.org/ns/structure/1.0">
<start>
    <element name="item">
        <interleave>
            <element name="id"><text/></element>
            <element name="name"><text/></element>
            <oneOrMore>
                <ref name="link"/>
            </oneOrMore>
        </interleave>
    </element>
</start>

<define name="link">
    <element name="link">
        <attribute name="href"/>
        <choice>
            <group>
                <attribute name="rel"><value>self</value></attribute>
                <attribute name="type"><value>type1</value></attribute>
            </group>
            <group>
                <attribute name="rel"><value>download</value></attribute>
                <attribute name="type"><value>type2</value></attribute>
            </group>
            <group>
                <attribute name="rel"><value>relatedData</value></attribute>
                <attribute name="type"><value>type3</value></attribute>
            </group>
        </choice>
    </element>
</define>
</grammar>
  • While it doesn't get me to a guaranteed-valid check of my data, this idea gets me closer - by putting a `` in my interleave with X `link`s, I can match 'X link objects which might be grouped somewhere in the item', and by using your `choice` idea in the schema of `link`, I can validate that all `link`s found are from a set of valid link elements. Higher level app code will still have to check that the 3 links received are the actual three that are expected in the returned object ('got one download link, one self link, and one relatedData link in the `item`'). Thanks. – James Aug 06 '12 at 21:48
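One possible way to close the remaining gap inside RELAX NG itself is to keep every `link` in a single operand of `interleave` and enumerate the allowed orders there. The interleave restriction only concerns the same element name appearing in more than one operand, so a single `choice` of ordered groups should not trigger it. The following is an untested sketch: `selfLink` comes from the question, while `downloadLink` and `relatedDataLink` are hypothetical names introduced for illustration, and acceptance by jing should be confirmed before relying on it.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="item">
      <interleave>
        <element name="id"><text/></element>
        <element name="name"><text/></element>
        <!-- Every link element sits in this one operand of interleave.
             Each group fixes one relative order of the three links;
             id and name may still appear between them. -->
        <choice>
          <group><ref name="selfLink"/><ref name="downloadLink"/><ref name="relatedDataLink"/></group>
          <group><ref name="selfLink"/><ref name="relatedDataLink"/><ref name="downloadLink"/></group>
          <group><ref name="downloadLink"/><ref name="selfLink"/><ref name="relatedDataLink"/></group>
          <group><ref name="downloadLink"/><ref name="relatedDataLink"/><ref name="selfLink"/></group>
          <group><ref name="relatedDataLink"/><ref name="selfLink"/><ref name="downloadLink"/></group>
          <group><ref name="relatedDataLink"/><ref name="downloadLink"/><ref name="selfLink"/></group>
        </choice>
      </interleave>
    </element>
  </start>

  <define name="selfLink">
    <element name="link">
      <attribute name="href"/>
      <attribute name="rel"><value>self</value></attribute>
      <attribute name="type"><value>type1</value></attribute>
    </element>
  </define>

  <!-- downloadLink and relatedDataLink are hypothetical names -->
  <define name="downloadLink">
    <element name="link">
      <attribute name="href"/>
      <attribute name="rel"><value>download</value></attribute>
      <attribute name="type"><value>type2</value></attribute>
    </element>
  </define>

  <define name="relatedDataLink">
    <element name="link">
      <attribute name="href"/>
      <attribute name="rel"><value>relatedData</value></attribute>
      <attribute name="type"><value>type3</value></attribute>
    </element>
  </define>
</grammar>
```

The number of groups grows factorially with the number of link types, so this is only practical for a small, fixed set such as the three links here.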