I am using the XQuery processor in Web-Harvest (from Java) to parse an HTML page that contains a <div> start tag with an invalid attribute name, for example <div 3px="abc">. The parser throws this exception:

SXXP0003: Error reported by XML parser: Element type "div" must be followed by either
attribute specifications, ">" or "/>".

at org.webharvest.runtime.processors.XQueryProcessor.execute(Unknown Source)

Is there a quick way to clean up the div in a pre-processing step before the XQuery processor runs? Or is there any other workaround for this problem?
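To make the kind of pre-processing meant here concrete, below is a minimal sketch in plain Java that drops attributes whose names are not valid XML names (such as `3px`) from start tags before the markup reaches the XQuery step. The class name `InvalidAttributeStripper` and its `strip` method are hypothetical, not part of Web-Harvest. The name check is a simplified ASCII-only approximation of the XML rules, and the regex-based approach assumes attribute values do not contain `<` or `>`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Web-Harvest.
public class InvalidAttributeStripper {
    // A start tag with no '<' or '>' inside it (assumption: values never contain them).
    private static final Pattern START_TAG = Pattern.compile("<[A-Za-z][^<>]*>");

    // One attribute: leading whitespace, a name (group 1), and an optional value.
    private static final Pattern ATTRIBUTE = Pattern.compile(
        "\\s+([^\\s=<>/\"']+)(?:\\s*=\\s*(?:\"[^\"]*\"|'[^']*'|[^\\s>\"']+))?");

    // Simplified, ASCII-only approximation of an XML name (must not start with a digit).
    private static final Pattern XML_NAME = Pattern.compile("[A-Za-z_:][A-Za-z0-9_:.\\-]*");

    // Returns the markup with invalid-named attributes removed from every start tag.
    public static String strip(String html) {
        Matcher tags = START_TAG.matcher(html);
        StringBuffer out = new StringBuffer();
        while (tags.find()) {
            tags.appendReplacement(out, Matcher.quoteReplacement(stripTag(tags.group())));
        }
        tags.appendTail(out);
        return out.toString();
    }

    // Walks the attributes of a single start tag in order and keeps only valid ones.
    private static String stripTag(String tag) {
        Matcher attrs = ATTRIBUTE.matcher(tag);
        StringBuffer out = new StringBuffer();
        while (attrs.find()) {
            boolean valid = XML_NAME.matcher(attrs.group(1)).matches();
            attrs.appendReplacement(out, valid ? Matcher.quoteReplacement(attrs.group()) : "");
        }
        attrs.appendTail(out);
        return out.toString();
    }
}
```

The idea would be to call `InvalidAttributeStripper.strip(pageSource)` on the downloaded page and hand the result to the XQuery processor. This only addresses invalid attribute names; other well-formedness problems in the HTML would still need a real HTML cleaner.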

Mat

0 Answers