0

How can I return the element, line, or tag from .xml file which status was invalid.

I'm using this procedure (link below)

Validating xml schema with python

Or you can find the code below :

validator.py

from lxml import etree

class Validator:

    def __init__(self, xsd_path: str):
        xmlschema_doc = etree.parse(xsd_path)
        self.xmlschema = etree.XMLSchema(xmlschema_doc)

    def validate(self, xml_path: str) -> bool:
        xml_doc = etree.parse(xml_path)
        result = self.xmlschema.validate(xml_doc)

        return result

Main.py

import os
from validator import Validator

validator = Validator("path/to/scheme.xsd")

# The directory with XML files
XML_DIR = "path/to/directory"

for file_name in os.listdir(XML_DIR):
    print('{}: '.format(file_name), end='')

    file_path = '{}/{}'.format(XML_DIR, file_name)

    if validator.validate(file_path):
        print('Valid! :)')
    else:
        print('Not valid! :(')

When I run this code, I get this result,

FILE_1.xml: Valid! :) FILE_2.xml: Valid! :) FILE_3.xml: Not valid! :( FILE_4.xml: Valid! :)

My question, I don't have the information regarding which rule was broken, in other words, which line from FILE_3.xml broke a rule in the xsd file. How can I return this info ?

Thank you if you can help

Carlos Carvalho
  • 107
  • 1
  • 3
  • 11

1 Answers1

1

You need to use assertValid and not validate. This will make that when the doc is invalid an exception will be thrown with the data you are looking for.

See https://lxml.de/validation.html (Look for 'If you prefer getting an exception when validating, you can use the assert_ or assertValid methods')

 xml_doc = etree.parse(xml_path)
 try:
     xmlschema.assertValid(xml_doc)
 except Exception as e:
     return False,str(e) 
 return True,'' 
balderman
  • 22,927
  • 7
  • 34
  • 52