16

How to verify whether an XML file is valid in sh (preferably) or bash?

I have a file that often gets corrupted and needs to be replaced while I take my time investigating the underlying issue.

Is there any easy way of performing this task with sh or bash?

codeforester
user2799603

4 Answers

27

Not directly with bash, but xmllint is fairly widely available.

xmllint --format "${xmlfile}"

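This exits with a non-zero status if the file is not well-formed XML (hint: in bash, $? holds the exit code of the last command). Without further options, xmllint checks only well-formedness, not validity against a DTD or schema.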
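For the use case in the question (replacing a corrupted file), the exit status can drive a small POSIX sh helper. This is only a sketch under assumptions: the function names is_well_formed and restore_if_broken are made up for illustration, a known-good backup copy is assumed to exist, and --noout is used instead of --format so that nothing is printed on success.

```shell
# is_well_formed FILE: succeeds if xmllint can parse FILE.
# (Hypothetical helper name; only the xmllint call comes from the answer.)
is_well_formed() {
    # --noout suppresses the serialized document; parse errors are discarded here.
    xmllint --noout "$1" 2>/dev/null
}

# restore_if_broken FILE BACKUP: copy BACKUP over FILE when FILE fails to parse.
# (Hypothetical helper; assumes BACKUP is a known-good copy kept elsewhere.)
restore_if_broken() {
    if is_well_formed "$1"; then
        return 0                      # nothing to do
    fi
    if ! is_well_formed "$2"; then
        echo "backup $2 is not well-formed either; leaving $1 alone" >&2
        return 1
    fi
    echo "restoring $1 from $2" >&2
    cp -- "$2" "$1"
}
```

A call might look like restore_if_broken /path/to/file.xml /path/to/file.xml.bak, where both paths are placeholders.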

Yuri
sanmiguel
  • 1
    Needs more quotes to be safe with filenames containing whitespace or glob characters. – Charles Duffy Feb 28 '14 at 18:26
  • 2
    That's actually exactly why I stopped using zsh years ago -- it resulted in too many bugs whenever I wrote code for POSIX-superset shells. – Charles Duffy Feb 28 '14 at 18:44
  • 1
    `xmllint --format "${xmlfile}" > /dev/null` may be more convenient. If the file is invalid, the command prints the error details to stderr. – Jason Pan Jun 24 '22 at 07:48
6

XMLStarlet has a validate subcommand. At its simplest, to check for well-formedness:

xmlstarlet val "$filename"

To validate against a DTD:

xmlstarlet val -d "$dtd_filename" "$xml_filename"

To validate against an XSD schema:

xmlstarlet val -s "$xsd_filename" "$xml_filename"

To validate against a RelaxNG schema:

xmlstarlet val -r "$rng_filename" "$xml_filename"

This isn't built into bash -- bash has no built-in XML parser, and validation cannot be performed without one -- but it is widely packaged for modern OS distributions.


XMLStarlet also has subcommands for extracting information from XML files, editing XML files, etc. If you're going to be working with XML from shell scripts, its use is well-advised.
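The four invocations above can be folded into one POSIX sh function that picks the mode from the schema file's extension. This is a sketch, not part of XMLStarlet: the name validate_xml and the extension-based dispatch are assumptions made for illustration, and on some systems the binary may be installed under a different name.

```shell
# validate_xml FILE [SCHEMA]
# Hypothetical wrapper: chooses an "xmlstarlet val" mode from SCHEMA's extension.
# With no SCHEMA it only checks well-formedness.
# Returns xmlstarlet's exit status, or 2 for an unrecognised schema type.
validate_xml() {
    file=$1
    schema=${2:-}
    case $schema in
        '')    xmlstarlet val "$file" ;;                 # well-formedness only
        *.dtd) xmlstarlet val -d "$schema" "$file" ;;    # DTD
        *.xsd) xmlstarlet val -s "$schema" "$file" ;;    # XSD schema
        *.rng) xmlstarlet val -r "$schema" "$file" ;;    # RelaxNG (XML syntax)
        *)     echo "unsupported schema type: $schema" >&2
               return 2 ;;
    esac
}
```

In a script this could be used as: if validate_xml "$xml_filename" "$xsd_filename"; then ...; fi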

Charles Duffy
3

If you want to validate against a RelaxNG schema, which is an alternative schema language to W3C XML Schema, you can use libxml2 (xmllint), but it supports only the RelaxNG XML syntax, not the compact syntax.

To validate an XML file with libxml2 against a RelaxNG schema:

xmllint --noout --relaxng schema.rng file.xml

It is possible to convert a RelaxNG schema from the compact syntax to the XML syntax with trang. You may also use Jing to validate against a RelaxNG schema.

With jing installed on your computer, you can validate a file file.xml against a schema schema.rng like this:

jing schema.rng file.xml

To use the RelaxNG compact syntax:

jing -c schema.rnc file.xml
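
Because xmllint handles only the XML syntax while jing -c handles the compact syntax, a script that may receive either kind of schema could dispatch on the file extension. The following is a rough sketch; the function name validate_relaxng and the convention that compact schemas end in .rnc are assumptions for illustration.

```shell
# validate_relaxng SCHEMA FILE
# Hypothetical helper: uses jing -c for compact-syntax schemas (*.rnc)
# and xmllint --relaxng for everything else (assumed to be XML syntax).
validate_relaxng() {
    case $1 in
        *.rnc) jing -c "$1" "$2" ;;
        *)     xmllint --noout --relaxng "$1" "$2" ;;
    esac
}
```

The exit status of whichever tool ran is passed through, so the function can be used directly in an if statement.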
emchateau
-1

Most parsers come with sample programs that can be run from the command line. Run one of those which validates the document.

There are many good tools. As someone who has implemented several of them, but who cares more about the language than the specific tool you use, I decline to recommend one over another. If you insist on an answer to the question as posed, it's "you can't do that in sh or bash per se... at least not unless you are enough of a masochist to try to write it from the ground up, and then performance will be awful."

Cody Gray - on strike
keshlam