1

I've received a first version of a WSDL whose schemas contain the following types:

<xs:complexType name="AComplexType">
   <xs:sequence>
     <xs:element minOccurs="0" name="description" nillable="true" type="xs:string"/>
     <xs:element minOccurs="0" name="version" nillable="true" type="xs:int"/>
   </xs:sequence>
</xs:complexType>

<xs:complexType name="Response">
  <xs:sequence>
    <xs:element minOccurs="0" name="responseDescription" nillable="true" type="xs:string"/>
    <xs:element maxOccurs="unbounded" minOccurs="0" name="listOfElements" nillable="true" type="AComplexType"/>
  </xs:sequence>
</xs:complexType> 

The following xml is valid for the xsd above:

<Response>
  <responseDescription>A response description</responseDescription>
  <listOfElements>
    <description>An element description</description>
    <version>1</version>
    <description>Another element description</description>
    <version>1</version>
    ...
  </listOfElements>
</Response>

Also, I was able to generate classes for these types with xjc, so the schema seems to be valid.

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "Response", propOrder = {
    "responseDescription",
    "listOfElements"
})
public class ConsultaExpedienteGATResponse {


    @XmlElementRef(name = "estado", namespace = "...", type = JAXBElement.class, required = false)
    protected JAXBElement<String> responseDescription;
    @XmlElement(nillable = true)
    protected List<AComplexType> listOfElements;

    ...
}

However, I thought that this kind of schema was invalid and should look like this:

<xs:complexType name="Response">
  <xs:sequence>
    <xs:element minOccurs="0" name="responseDescription" nillable="true" type="xs:string"/>
    <xs:element name="listOfElements">
      <xs:complexType>
        <xs:sequence>
          <xs:element maxOccurs="unbounded" name="oneElement" type="AComplexType"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:sequence>
</xs:complexType> 

With this schema, the XML is slightly different:

<Response>
  <responseDescription>A response description</responseDescription>
  <listOfElements>
    <oneElement>
      <description>An element description</description>
      <version>1</version>
    </oneElement>
    <oneElement>
      <description>Another element description</description>
      <version>1</version>
    </oneElement>
    ...
  </listOfElements>
</Response>

So, I wonder if there are pros/cons for each option (for example, better performance when parsing the XML), or whether one of the two is the accepted or default choice.

gabrielgiussi

3 Answers

1

The two styles of writing a schema are sometimes called "venetian blind" and "russian doll". Google these terms and you will find plenty of people arguing which one is best in which circumstances. Like all matters of coding style, the discussion tends to generate more heat than light. Both are perfectly valid and neither is going to give any performance edge.

My own preference tends towards the "venetian blind" style (with named global element declarations and/or named complex types, but not usually both), because global declarations are reusable. This is particularly useful if you are using schema-aware XSLT and XQuery, because global element and type names can then be exploited in your XSLT/XQuery code.

Michael Kay
0

Performance will not be impacted by this decision.

Your former design relies upon adjacency to associate descriptions with versions. This is not "invalid." Although you'll see this pattern in document-based XML (such as headings preceding paragraphs), it's non-ideal for data-oriented XML.

Instead make the association more explicit via hierarchy as you do in your latter design, or via attributes (especially for version, which won't likely require subordinate markup). See also XML attribute vs XML element.
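To illustrate why the hierarchical form maps more directly to objects, here is a minimal sketch using only the JDK's DOM parser. The class name `HierarchyDemo`, the method `readItems`, and the sample document are hypothetical; the element names `oneElement`, `description`, and `version` are taken from the second schema in the question. Each record is found by its own wrapper element, so no adjacency bookkeeping is needed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class HierarchyDemo {

    // Hypothetical sample shaped like the second (wrapped) design.
    static final String WRAPPED =
          "<Response><listOfElements>"
        + "<oneElement><description>An element description</description><version>1</version></oneElement>"
        + "<oneElement><description>Another element description</description><version>2</version></oneElement>"
        + "</listOfElements></Response>";

    // Returns one "description vN" string per oneElement, in document order.
    static List<String> readItems(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            NodeList items = root.getElementsByTagName("oneElement");
            List<String> out = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                // Each wrapper element scopes exactly one description/version pair.
                Element item = (Element) items.item(i);
                String description =
                    item.getElementsByTagName("description").item(0).getTextContent();
                String version =
                    item.getElementsByTagName("version").item(0).getTextContent();
                out.add(description + " v" + version);
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        readItems(WRAPPED).forEach(System.out::println);
    }
}
```

With the adjacency-based layout, the same loop would instead have to walk sibling nodes and guess where one record ends and the next begins.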

kjhughes
  • Yes, it relies on adjacency, and that's why I thought it wasn't valid: it seems very "fragile". Also, when I talked about performance I was thinking about how the parser knows that one element ends and another one starts (it seems like more work than relying on start and end tags). On the other hand, the first option could be seen as more bandwidth-efficient due to fewer tags. – gabrielgiussi Oct 05 '16 at 14:08
  • I agree with you that the first option is better for data-oriented XML, but I lack strong reasons for rejecting the proposed schemas by the other team. – gabrielgiussi Oct 05 '16 at 14:11
  • The first option relies on adjacency and is better for ***document***-oriented XML. (Your comment has it backwards, perhaps a typo.) Go with the second option, which is better for data-oriented XML, which is what you have, and will map better to objects should you ever need to do so. Again, performance simply isn't the issue here. You'll never get to the point where markup overhead will become the bottleneck to your system. – kjhughes Oct 05 '16 at 14:50
  • Yes, it was a typo. I meant _"I agree with you that the **last** option is better for data-oriented XML"_. Thanks kjhughes – gabrielgiussi Oct 05 '16 at 14:57
  • You're welcome. Please [**accept**](http://meta.stackoverflow.com/q/5234/234215) this answer if it's helped. Thanks. – kjhughes Oct 05 '16 at 22:29
0

Although the XML

<Response>
  <responseDescription>A response description</responseDescription>
  <listOfElements>
    <description>An element description</description>
    <version>1</version>
    <description>Another element description</description>
    <version>2</version>
    ...
  </listOfElements>
</Response>

appeared to be valid for the first schema specified, when it was unmarshalled into the classes (with JAXB annotations) generated by xjc, the property listOfElements was a list containing only the last element (in this case, the element with description "Another element description" and version 2).

The XML that is actually valid for this schema is

<Response>
  <responseDescription>A response description</responseDescription>
  <listOfElements>
    <description>An element description</description>
    <version>1</version>
  </listOfElements>
  <listOfElements>
    <description>Another element description</description>
    <version>2</version>
  </listOfElements>
</Response>

(The confusion came from an incorrect XML sample written by hand by the developers of the WSDL; once I created a request for that WSDL with SoapUI, the error became clear.)
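The difference between the two documents can be checked with a validating parser rather than by unmarshalling alone. The following is a minimal sketch using the JDK's `javax.xml.validation` API. It makes some assumptions that are not in the original WSDL: a global `Response` element is declared so the sample documents have a root to validate against, `responseDescription` is typed as `xs:string` so the sample text is acceptable, and the names `ListSchemaCheck`, `isValid`, `FLAT`, and `REPEATED` are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class ListSchemaCheck {

    // First schema from the question, wrapped in xs:schema. The global
    // Response element and the xs:string type are assumptions of this sketch.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='Response' type='Response'/>"
        + "<xs:complexType name='AComplexType'><xs:sequence>"
        + "<xs:element minOccurs='0' name='description' nillable='true' type='xs:string'/>"
        + "<xs:element minOccurs='0' name='version' nillable='true' type='xs:int'/>"
        + "</xs:sequence></xs:complexType>"
        + "<xs:complexType name='Response'><xs:sequence>"
        + "<xs:element minOccurs='0' name='responseDescription' nillable='true' type='xs:string'/>"
        + "<xs:element maxOccurs='unbounded' minOccurs='0' name='listOfElements'"
        + " nillable='true' type='AComplexType'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:schema>";

    // All pairs inside a single listOfElements (the hand-written sample).
    static final String FLAT =
          "<Response><responseDescription>A response description</responseDescription>"
        + "<listOfElements>"
        + "<description>An element description</description><version>1</version>"
        + "<description>Another element description</description><version>2</version>"
        + "</listOfElements></Response>";

    // One listOfElements per pair (the corrected sample).
    static final String REPEATED =
          "<Response><responseDescription>A response description</responseDescription>"
        + "<listOfElements><description>An element description</description>"
        + "<version>1</version></listOfElements>"
        + "<listOfElements><description>Another element description</description>"
        + "<version>2</version></listOfElements>"
        + "</Response>";

    // Returns true if the document validates against XSD, false otherwise.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flat valid:     " + isValid(FLAT));
        System.out.println("repeated valid: " + isValid(REPEATED));
    }
}
```

Under these assumptions, the flat document is rejected because `AComplexType` allows at most one `description` followed by at most one `version`, while the repeated form is accepted. JAXB does not validate unless a schema is set on the unmarshaller, which would explain why the flat document was silently collapsed instead of failing.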

What remains is to decide whether it's better to have a wrapper element for the list, like

<Response>
  <responseDescription>A response description</responseDescription>
  <listOfElements>
    <AComplexType>
      <description>An element description</description>
      <version>1</version>
    </AComplexType>
    <AComplexType>
      <description>Another element description</description>
      <version>2</version>
    </AComplexType>
  </listOfElements>
</Response>

But to answer my original question, the first XML described was wrong.

gabrielgiussi