I need to validate an XML document so that my code rejects invalid XML.

I did it this way to handle an invalid document:

begin
  # Strict mode makes Nokogiri raise on the first syntax error
  xml ||= Nokogiri::XML(xml_data) do |config|
    config.strict
  end
rescue Nokogiri::XML::SyntaxError => e
  puts "caught exception: #{e}"
else
  # further processing if no error
end

But even for a valid XML document, it prints:

caught exception: Extra content at the end of the document

Sample XML I'm using:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

What am I doing wrong?

Ahmad hamza
  • I can't reproduce your error; for me it raises no errors. Please check the error definition and correct your XML. Check this answer: http://stackoverflow.com/a/16972780/644810 – Rustam Gasanov Jan 04 '16 at 15:23
  • @mudasobwa Actually, whitespace doesn't trigger this error; I've just checked this case. More likely he has multiple tags being opened and closed at the top level, when there should be only one root element. – Rustam Gasanov Jan 04 '16 at 15:27
  • @RustamA.Gasanov Indeed. – Aleksei Matiushkin Jan 04 '16 at 15:33
  • @ahmadhamza still not. Try this code: http://pastie.org/10669579 – Rustam Gasanov Jan 04 '16 at 15:44
  • Please don't use pastie or offsite repositories to store the necessary XML or code. See "[ask]" and "[mcve]". Put the *MINIMAL* XML necessary to demonstrate the problem *in the question itself*. When the links rot and break the question will not be able to help people looking for similar solutions in the future. Also, your Ruby is invalid. It should be syntactically correct. – the Tin Man Jan 04 '16 at 17:21

1 Answer

If you want to see whether a document is malformed XML, simply call the errors method on the returned document:

require 'nokogiri'

doc = Nokogiri::XML('<xml><foo></xml>')
doc.errors
# => [#<Nokogiri::XML::SyntaxError: Opening and ending tag mismatch: foo line 1 and xml>,
#     #<Nokogiri::XML::SyntaxError: Premature end of data in tag xml line 1>]

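By default Nokogiri parses in recovery mode, so it does not raise on malformed input. If it finds any errors it'll populate the errors array, which stays empty for a well-formed document.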

the Tin Man