0

I'm parsing XML documents with Java. Every document has a root tag (its name can be any string) and an unknown number of child tags containing text, as in the example below. Each <AnyStrYouwant> tag has a string of characters in its body.

<anyRoot>
    <AnyStrYouwant1>anyTextYouWant1</AnyStrYouwant1>
    <AnyStrYouwant2>anyTextYouWant2</AnyStrYouwant2>
    ...
</anyRoot>

How can I programmatically (in Java) check whether a file matches this structure? I can parse XML, and I know that a DTD, for example, can validate an XML file with a known format (tag names and content). What should I use in this case?

PS: Some people advise me to use XSD. But to validate elements I need to know the root element name, and I don't know it (every file has its own root element).

BrettWatts
  • possible duplicate of [What's the best way to validate an XML file against an XSD file?](http://stackoverflow.com/questions/15732/whats-the-best-way-to-validate-an-xml-file-against-an-xsd-file) –  Jul 23 '14 at 08:41
  • One thing I was considering was XSD-based validation using some regex, but that does NOT seem possible in your case. Please refer to an existing question: http://stackoverflow.com/questions/12929550/xsd-from-variable-number-of-xml-elements – AurA Jul 23 '14 at 08:43
  • OK. I don't know the root element name. What should I do? – BrettWatts Jul 23 '14 at 09:32

3 Answers

1

I can't comment with my new account, but yes, you can use DTD or Schematron. Schematron is much more flexible and is an industry standard, whereas DTD is really a legacy technology, though still widely used. In short, DTD checks which tags are allowed, while Schematron can check the structure of the file, for example that certain tags appear within the first 10 lines of the XML.

I would use DTD if you are only checking for existing tags and allowed attribute values. For anything more complex, I would recommend Schematron with its rule-based validation.
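
To illustrate the rule-based idea for the structure in the question (any root name, any child names, text-only children), here is a minimal sketch. It is not Schematron itself: it only borrows the idea of expressing rules as XPath assertions and evaluates them with the XPath support built into the JDK. The class name FlatXmlRules, the method isFlat, and the two rules are assumptions made for illustration, not something taken from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class FlatXmlRules {

    // Hypothetical rules; each XPath 1.0 expression must evaluate to true.
    private static final String[] RULES = {
        "not(/*/*/*)",                       // children of the root contain no nested elements
        "not(/*/text()[normalize-space()])"  // the root itself holds no non-whitespace text
    };

    // Returns true if the XML is well-formed and satisfies every rule above.
    public static boolean isFlat(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            for (String rule : RULES) {
                Boolean ok = (Boolean) xpath.evaluate(rule, doc, XPathConstants.BOOLEAN);
                if (!ok) {
                    return false;
                }
            }
            return true;
        } catch (Exception e) {
            // Malformed XML or a parser/XPath problem counts as "does not match".
            return false;
        }
    }
}
```

Because the rules never mention a tag name, the root and child names can be anything. Further constraints (for example on attributes) would be added as extra XPath expressions.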

  • I wouldn't call DTD legacy, it just does a different job. DTD checks that an XML file is valid with respect to a specific grammar, and that's all. Schematron is a higher-level tool for spotting patterns within valid XML files. A DTD will typically apply to every element in an XML doc, whereas Schematron is more likely to apply only to specific elements. You can use Schematron to validate against a grammar, but you'd end up with your Schematron looking like a DTD... – Synchro Jul 23 '14 at 09:14
  • @Synchro Yes, you are totally right, but nothing stops you from mixing these two approaches and checking grammar and structure with both technologies. It all depends on what you really want to achieve. – Szymon Krawczyk Jul 23 '14 at 09:17
  • @亚历山大 This is not even a question; he has no defined DTD schema, so my answer is perfectly fine. I gave him my opinion and two possible approaches to his problem. – Szymon Krawczyk Jul 23 '14 at 09:19
0

You can use DTD or XSD to validate XML. Take a look at:

http://www.w3schools.com/xml/xml_dtd.asp

http://www.journaldev.com/895/how-to-validate-xml-against-xsd-in-java

XSD is the more advanced way to validate XML and is more flexible than DTD, but either technology can solve your problem.

mkazma
0

You can check XML with XSD using this sample code.

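import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;

public class XmlValidator {
    // Returns true if the XML read from 'is' validates against the XSD file.
    public boolean isValidXML(InputStream is) {
        // try-with-resources closes the schema file when validation finishes.
        try (FileInputStream xsd = new FileInputStream("path/your-xsd-file.xsd")) {
            SAXSource sourceXSD = new SAXSource(new InputSource(xsd));
            SchemaFactory
                    .newInstance("http://www.w3.org/2001/XMLSchema")
                    .newSchema(sourceXSD).newValidator()
                    .validate(new StreamSource(is));
        } catch (Exception e) {
            // Any schema, parse, or I/O problem is treated as "not valid".
            return false;
        }
        return true;
    }
}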
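
As a rough illustration of what such a schema could look like for the question's structure, the sketch below validates against an in-memory XSD that declares one root element and accepts any child elements through xs:any. The root name anyRoot is taken from the question's example and must be known to the schema, which is exactly the limitation the question raises. The class name KnownRootXsdCheck and the schema text are assumptions for illustration; note that processContents="skip" does not check that the children contain only text.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

public class KnownRootXsdCheck {

    // Hypothetical schema: a fixed root name, any number of arbitrary child elements.
    private static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "  <xs:element name='anyRoot'>"
        + "    <xs:complexType>"
        + "      <xs:sequence>"
        + "        <xs:any processContents='skip' minOccurs='0' maxOccurs='unbounded'/>"
        + "      </xs:sequence>"
        + "    </xs:complexType>"
        + "  </xs:element>"
        + "</xs:schema>";

    // Returns true if the XML string validates against the schema above.
    public static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // Schema, parse, or validation errors all count as "not valid".
            return false;
        }
    }
}
```

A document whose root is named anything other than anyRoot is rejected, so this approach only fits when the set of possible root names is known in advance.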
Athanor