I am trying to use ColdFusion's XMLValidate() function to validate an XML Document against the ONIX XSD Schema. ONIX is a 'standard' XML format used by some aspects of the book publishing industry.

This is a sample XML document. [I modified some data for client NDA reasons; sorry for the length.]

<?xml version="1.0"?>
<ONIXMessage release="3.0">
  <Header>       
    <Sender>
      <SenderName>Me</SenderName>   
    </Sender>
    <SentDateTime>20131030T090000Z</SentDateTime>
    <MessageNote>My Test for SO</MessageNote>
  </Header>
  <Product>  
    <RecordReference>12345</RecordReference>     
    <NotificationType>03</NotificationType>
    <RecordSourceType>01</RecordSourceType>
    <RecordSourceName>Me</RecordSourceName>   
    <ProductIdentifier>
      <ProductIDType>15</ProductIDType>
      <IDValue>12324567801011</IDValue>
    </ProductIdentifier>
    <DescriptiveDetail>
      <ProductComposition>00</ProductComposition>
      <ProductForm>ED</ProductForm>
      <ProductFormDetail>E101</ProductFormDetail>
      <ProductFormDetail>E127</ProductFormDetail>
      <PrimaryContentType>10</PrimaryContentType>
      <EpubTechnicalProtection>01</EpubTechnicalProtection>
      <Collection>
        <CollectionType>10</CollectionType>
        <CollectionSequence>
          <CollectionSequenceType>02</CollectionSequenceType>
          <CollectionSequenceNumber>11</CollectionSequenceNumber>
        </CollectionSequence>
        <TitleDetail>
           <TitleType>01</TitleType>
           <TitleElement>
              <TitleElementLevel>02</TitleElementLevel>
              <TitlePrefix><![CDATA[The]]></TitlePrefix>
              <TitleWithoutPrefix><![CDATA[Something]]></TitleWithoutPrefix>
           </TitleElement>
        </TitleDetail>
      </Collection>    
      <TitleDetail>
        <TitleType>01</TitleType>
        <TitleElement>
          <SequenceNumber>1</SequenceNumber>
          <TitleElementLevel>01</TitleElementLevel>
          <TitleText>
            <![CDATA[ForSO]]>
          </TitleText>
        </TitleElement>
        <TitleStatement><![CDATA[The Something for SO]]></TitleStatement>
      </TitleDetail>
      <Contributor>  
        <SequenceNumber>1</SequenceNumber>
        <ContributorRole>A01</ContributorRole>
        <PersonName>Me, Myself</PersonName>
        <PersonNameInverted>Myself, Me</PersonNameInverted>
        <NamesBeforeKey>Myself</NamesBeforeKey>
        <KeyNames>Me</KeyNames>
      </Contributor>
      <Contributor>
        <SequenceNumber>2</SequenceNumber>
        <ContributorRole>A01</ContributorRole>
        <PersonName>Someone Else</PersonName>
        <PersonNameInverted>Else, Someone</PersonNameInverted>
        <NamesBeforeKey>Someone</NamesBeforeKey>
        <KeyNames>Else</KeyNames>
      </Contributor>
      <ContributorStatement>Me Myself and Someone Else</ContributorStatement>
      <Language>
        <LanguageRole>01</LanguageRole>
        <LanguageCode>eng</LanguageCode>
      </Language>
      <Extent>                              
        <ExtentType>00</ExtentType>
        <ExtentValue>40</ExtentValue>
        <ExtentUnit>03</ExtentUnit>
      </Extent>
      <Subject>
        <MainSubject/>
        <SubjectSchemeIdentifier>10</SubjectSchemeIdentifier>
        <SubjectCode>JUV001000</SubjectCode>
      </Subject>
      <Audience>
        <AudienceCodeType>01</AudienceCodeType>
        <AudienceCodeValue>02</AudienceCodeValue>
      </Audience>
      <AudienceRange>
        <AudienceRangeQualifier>17</AudienceRangeQualifier>
        <AudienceRangePrecision>03</AudienceRangePrecision>
        <AudienceRangeValue>8</AudienceRangeValue>
        <AudienceRangePrecision>04</AudienceRangePrecision>
        <AudienceRangeValue>12</AudienceRangeValue>
      </AudienceRange>
    </DescriptiveDetail>
    <CollateralDetail>
      <TextContent>
        <TextType>03</TextType>
        <ContentAudience>00</ContentAudience>
        <Text textformat="02">
          <![CDATA[Something, Something, Something, Dark Side]]>
        </Text>
      </TextContent>
    </CollateralDetail>
    <PublishingDetail>
      <Imprint>
        <ImprintName>Fake Publisher</ImprintName>
      </Imprint>
      <Publisher>
        <PublishingRole>01</PublishingRole>
        <PublisherName>Fake Publisher</PublisherName>
      </Publisher>
      <PublishingStatus>02</PublishingStatus>
      <PublishingDate>
        <PublishingDateRole>01</PublishingDateRole>
        <DateFormat>00</DateFormat>
        <Date>20110701</Date>
      </PublishingDate> 
      <SalesRights>                                  
        <SalesRightsType>02</SalesRightsType>
        <Territory>                                         
          <CountriesIncluded>GB AU NZ</CountriesIncluded>
        </Territory>
      </SalesRights>
      <SalesRestriction>
        <SalesRestrictionType>04</SalesRestrictionType>
        <SalesOutlet>
          <SalesOutletIdentifier>
            <SalesOutletIDType>03</SalesOutletIDType>
            <IDValue>AMZ</IDValue>
          </SalesOutletIdentifier>
          <SalesOutletName>Amazon</SalesOutletName>
        </SalesOutlet>
        <SalesOutlet>
          <SalesOutletIdentifier>
            <SalesOutletIDType>03</SalesOutletIDType>
            <IDValue>BNO</IDValue>
          </SalesOutletIdentifier>
          <SalesOutletName>Barnes And Noble</SalesOutletName>
        </SalesOutlet>
      </SalesRestriction>
    </PublishingDetail>
    <ProductSupply>
      <SupplyDetail>
        <Supplier>
          <SupplierRole>01</SupplierRole>
          <SupplierName>Me</SupplierName>
        </Supplier>
        <ProductAvailability>20</ProductAvailability>
        <Price>                          
          <PriceAmount>12.99</PriceAmount>
          <CurrencyCode>USD</CurrencyCode>
          <Territory>
            <CountriesIncluded>GB AU NZ</CountriesIncluded>
          </Territory>
        </Price>
      </SupplyDetail>
    </ProductSupply>
  </Product>
</ONIXMessage>  

Here is some sample code to validate the above document against the schema. To run it, download the schema from the link above and unzip the schema documents into the root directory of your web server. Save the above XML as a file named Sample.xml; you may have to modify the file path on the cffile tag.

<cffile action="read" file="Sample.xml" variable="xmlFileResults" />
<cfset xmlValidateResults = xmlValidate(xmlFileResults,'#cgi.http_host#/ONIX_BookProduct_3.0_reference.xsd') />
<cfdump var="#xmlValidateResults#" /><br/><br/>

It provides this error:

[Error] :2:28: cvc-elt.1: Cannot find the declaration of element 'ONIXMessage'.

If I read the error correctly, the validator cannot find a declaration for the ONIXMessage element, but I'm rather confused as to why.

If I take a simpler XML file:

<?xml version="1.0"?>
<ONIXMessage release="3.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
</ONIXMessage>  

I still get the same error.

I'm trying to determine whether there is an error with the XML, the XSD, or CF's validation functionality.

Has anyone seen this? Does anyone have any insight?


Based on comments, I wanted to add this:

I have no idea if this helps with debugging, but if I change the XML header to this:

<ONIXMessage release="3.0" xmlns="http://ns.editeur.org/onix/3.0/reference"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
            xsi:schemaLocation="http://ns.editeur.org/onix/3.0/reference http://myserver/ONIX_BookProduct_3.0_reference.xsd"
            >

and remove the XSD reference from the XMLValidate() function:

<cffile action="read" file="Sample.xml" variable="xmlFileResults" />
<cfset xmlValidateResults = xmlValidate(xmlFileResults) />
<cfdump var="#xmlValidateResults#" /><br/><br/>

I get a slew of different errors, although they seem valid. It has no trouble finding the declaration of ONIXMessage, but it does flag a lot of other errors. [I'm not sure yet whether they are valid.]

Unfortunately, it is not practical in our environment to hard-code a schema location in the XML document.

JeffryHouser
  • You may already know this but I found an [online ONIX file validator](http://www.readyet.net/tools/onixFileValidation/3). You can use that to verify the XML/XSD outside of ColdFusion. I tried the sample that you posted but it failed on the bits you modified to post here, `TitleText`. Not sure if you can try an actual version on that site or at least one with more production-like data. – Miguel-F Dec 09 '13 at 20:48
  • @Miguel-F I did not already know that. Thanks for the link. I see the same errors you are seeing when I upload a "real" one. But, that doesn't address the current errors. Maybe the problem truly is with CF's validation library. – JeffryHouser Dec 09 '13 at 21:12
  • I was hoping that link would at least validate the XML for you. Not sure what to think of it failing over there as well (albeit with a different error). Can't really tell for sure what XSD they are using either. Presumably the same one you are for version 3. I will try to download the XSD and try your sample code to see if I get the same results. Stay tuned... – Miguel-F Dec 09 '13 at 21:21
  • I downloaded and tested your samples with ColdFusion 9.0.1 and got the same error. Then something occurred to me and I wonder if you are having the same issue. Can you surf to the XSD file on your server? I am running IIS on Windows 2008 Server and the .xsd file extension is not allowed by default. So when I attempt to browse the file I get a 404. So I assume the `XMLValidate()` function cannot browse the XSD either and so it fails without a very useful error message. It's a shot in the dark... – Miguel-F Dec 09 '13 at 21:38
  • @Miguel-F Yes, I can surf to the XSD on my web server without issue. I'm using Apache [for my local dev] and it loaded right up. Awesome idea, though, and I may run into that issue later when pushing to our shared dev server. I suspect that if I can load it in a browser, CF would have no problem loading the same file. Once I remove CF from the equation, a few more options come up in search results (http://stackoverflow.com/questions/13310637/cvc-elt-1-cannot-find-the-declaration-of-element-myelement), but unfortunately they did not solve the issue. – JeffryHouser Dec 09 '13 at 21:49
  • Yes I agree, if you can browse the XSD then I assume CF can as well. In my simple testing I am guessing that CF is having an issue with the XSD for some reason. Reason being, I get the same validation error when I omit the XSD reference entirely from the `xmlValidate()` call. – Miguel-F Dec 09 '13 at 21:57
  • If I use CFHTTP to load the XSD, it loads properly and I can display it on screen, so I believe it is safe to assume that my CF instance can access the XSD file. If I add the schema location to the XML file and remove it from the xmlValidate() call, I get different results [which I am currently validating for correctness]. I edited my question to reflect this 'new' development. – JeffryHouser Dec 09 '13 at 22:57
  • Okay; the problem appears to relate to the use of `xs:include` in the `ONIX_BookProduct_3.0_reference.xsd` file. Combining the included files into a single file appears to resolve a lot of the validation errors. I'm doing more testing now. – JeffryHouser Dec 09 '13 at 23:11

1 Answer

I was able to solve this issue. I had to do a few things.

First, the original schema made use of includes to include other files:

<xs:include schemaLocation="ONIX_BookProduct_CodeLists.xsd"/>
<xs:include schemaLocation="ONIX_XHTML_Subset.xsd"/>

I had to remove these includes and combine the three files into a single file for the validation to work. I do not know if it was an issue with the validator finding the files--due to the local path--or an issue the validation routine has with includes. Creating a single schema without the use of includes solved the issue.

I also had to add the schema location to the XML Header. This was the original:

<ONIXMessage release="3.0">

And this is the modified header:

<ONIXMessage xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" release="3.0"
             xmlns="http://ns.editeur.org/onix/3.0/reference"  
             xsi:schemaLocation="http://ns.editeur.org/onix/3.0/reference http://mydomain.com/ONIX_BookProduct_3.0_reference.xsd">

Then the validation works--or at least it appears to. I'm still getting errors, but I believe they are errors in the XML documents I'm trying to validate.

<cffile action="read" file="Sample.xml" variable="xmlFileResults" />
<cfset xmlValidateResults = xmlValidate(xmlFileResults) />
<cfdump var="#xmlValidateResults#" /><br/><br/>
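
As a rough illustration of why the `xmlns` part of the modified header may matter (this is my reading, not something confirmed above, and it is written in Python rather than CFML): the schema location in the header pairs the XSD with the `http://ns.editeur.org/onix/3.0/reference` namespace, so a root element without that namespace has a different qualified name from the one the schema would declare. The sketch below only shows how a namespace-aware parser names the two root elements; it does not perform any schema validation.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the modified header above.
ONIX_NS = "http://ns.editeur.org/onix/3.0/reference"

# Root element as originally produced (no namespace declaration).
original = '<ONIXMessage release="3.0"></ONIXMessage>'
# Root element with the default namespace added.
modified = f'<ONIXMessage release="3.0" xmlns="{ONIX_NS}"></ONIXMessage>'

# ElementTree reports namespaced names as "{namespace}localname".
original_tag = ET.fromstring(original).tag
modified_tag = ET.fromstring(modified).tag

print(original_tag)
print(modified_tag)
```

The two printed names differ, which would be consistent with a validator reporting that it cannot find a declaration for the un-namespaced `ONIXMessage`.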

To avoid changing the process that creates the files, I'm using this code to add the schema information to the header on the fly before we perform validation:

<!--- Match the opening ONIXMessage tag, whatever attributes it carries --->
<cfset sourceString = "<ONIXMessage ([^<>]+)?>" />
<cfset replacementString = "<ONIXMessage xmlns:xsd=""http://www.w3.org/2001/XMLSchema"" xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"" release=""3.0"" xmlns=""http://ns.editeur.org/onix/3.0/reference"" xsi:schemaLocation=""http://ns.editeur.org/onix/3.0/reference #someVariableWithAbsoluteURLToOnixSchema#"">" />
<cfset fileToValidate = REReplaceNoCase(fileToValidate, sourceString, replacementString ) />

Thanks to this answer for the Regex for finding an opening tag.
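
For anyone doing the same rewrite outside ColdFusion, here is a minimal Python sketch of the idea. It is not a port of the CFML above: it keeps whatever attributes the root tag already has instead of hard-coding `release="3.0"`, and it drops existing double-quoted `xmlns`/`xsi:schemaLocation` attributes so they are not duplicated. The function name `add_onix_schema_header` and the `example.com` schema URL are hypothetical, and the sketch assumes the root tag is not self-closing.

```python
import re

ONIX_NS = "http://ns.editeur.org/onix/3.0/reference"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Opening root tag; group 1 holds its existing attributes (possibly empty).
ROOT_TAG = re.compile(r"<ONIXMessage\b([^<>]*)>")
# Namespace and schema-location attributes that will be re-added below.
NS_ATTRS = re.compile(r'\s+(?:xmlns(?::\w+)?|xsi:schemaLocation)\s*=\s*"[^"]*"')


def add_onix_schema_header(xml_text: str, schema_url: str) -> str:
    """Return xml_text with namespace and schema location set on the root tag."""

    def rewrite(match: re.Match) -> str:
        # Keep attributes such as release="3.0"; drop old namespace declarations.
        attrs = NS_ATTRS.sub("", match.group(1)).rstrip()
        return (
            f'<ONIXMessage{attrs} xmlns="{ONIX_NS}" xmlns:xsi="{XSI_NS}" '
            f'xsi:schemaLocation="{ONIX_NS} {schema_url}">'
        )

    # Only the first match (the root element) is rewritten.
    return ROOT_TAG.sub(rewrite, xml_text, count=1)


# Hypothetical schema URL, for illustration only.
SCHEMA_URL = "http://example.com/ONIX_BookProduct_3.0_reference.xsd"
sample = (
    '<?xml version="1.0"?>\n'
    f'<ONIXMessage release="3.0" xmlns:xsi="{XSI_NS}">\n'
    "</ONIXMessage>"
)
result = add_onix_schema_header(sample, SCHEMA_URL)
print(result)
```

Because all existing namespace declarations on the root tag are dropped, a document that actually uses another prefix declared there (for example `xsd:`) would need that declaration added back.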

JeffryHouser