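I am using the XQuery processor in Web-Harvest (from Java) to parse an HTML page that contains an invalid attribute inside a <div> element, like <div 3px="abc">. The attribute name starts with a digit, which is not allowed in XML, so the parser rejects the page. The exception is: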
SXXP0003: Error reported by XML parser: Element type "div" must be followed by either
attribute specifications, ">" or "/>".
at org.webharvest.runtime.processors.XQueryProcessor.execute(Unknown Source)
Is there a quick way to clean up the div in a pre-processing step before it reaches the XQuery processor? Or is there any other workaround for this problem?
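
To make concrete what I mean by pre-processing, here is a rough sketch of the kind of cleanup I have in mind, done in plain Java before the markup is handed to Web-Harvest. The class name AttributeCleaner and the method stripInvalidAttributes are made up for illustration and are not part of Web-Harvest. The regex is naive: it only handles quoted attribute values and would also match similar-looking text outside of tags.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Web-Harvest: strips attributes whose
// names start with a digit (e.g. 3px="abc"), since such names are not
// valid in XML and make the parser fail.
class AttributeCleaner {

    // Whitespace, then a name starting with a digit, then = and a quoted value.
    // Assumption: attribute values are always quoted.
    private static final Pattern BAD_ATTR =
            Pattern.compile("\\s+\\d[\\w:.-]*\\s*=\\s*(\"[^\"]*\"|'[^']*')");

    static String stripInvalidAttributes(String html) {
        return BAD_ATTR.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String dirty = "<div 3px=\"abc\" class=\"x\">text</div>";
        // prints: <div class="x">text</div>
        System.out.println(stripInvalidAttributes(dirty));
    }
}
```

Something like this works for the one case above, but I would prefer a more robust approach than a regex if one exists.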