The program below outputs:
<div>Average Response Time server is critical because its value 282 > 0 ms. <br/>[Threshold Details : Critical if value > 0, Warning if value = 0, Clear if value < 0]</div>
package test;

import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.xerces.dom.DocumentImpl;
import org.cyberneko.html.parsers.DOMFragmentParser;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.xml.sax.InputSource;

public class TestHTMLDOMFragment {

    private static final String PARSE_TEXT = "<div>Average Response Time server is critical because its value 282 > 0 ms. <br>[Threshold Details : Critical if value > 0, Warning if value = 0, Clear if value < 0]</div>";

    public static void main(String[] argv) throws Exception {
        DOMFragmentParser parser = new DOMFragmentParser();
        // output the element names in lowercase; nekohtml doesn't do this by default
        parser.setProperty("http://cyberneko.org/html/properties/names/elems", "lower");
        // if this is set to true (the default, so you don't need to specify it),
        // nekohtml won't add html, head and body tags to the result
        parser.setFeature("http://cyberneko.org/html/features/document-fragment", true);

        Document document = new DocumentImpl();
        DocumentFragment fragment = document.createDocumentFragment();
        // parse the text into the fragment
        parser.parse(new InputSource(new StringReader(PARSE_TEXT)), fragment);

        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        // don't output the XML declaration (<?xml ...?>)
        transformer.setOutputProperty("omit-xml-declaration", "yes");
        DOMSource source = new DOMSource(fragment);
        StreamResult result = new StreamResult(System.out);
        transformer.transform(source, result);
    }
}
The comments in the code above explain the parser settings I've used.
I used org.cyberneko.html.parsers.DOMFragmentParser because the text you are parsing may be just an HTML fragment rather than a complete document.
I'm using nekohtml 1.9.14.
If you use Maven, here's the dependencies section of the pom.xml:
<dependencies>
    <dependency>
        <groupId>net.sourceforge.nekohtml</groupId>
        <artifactId>nekohtml</artifactId>
        <version>1.9.14</version>
        <type>jar</type>
    </dependency>
</dependencies>
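
If you need the result as a String instead of printing it to System.out, the serialization step can be pulled out into a small helper. This is only a rough sketch: the class name FragmentToString and the methods serialize and demo are made up for illustration, and to keep it runnable with just the JDK it builds a tiny fragment by hand rather than parsing with nekohtml. In the program above you would pass the parsed fragment to serialize instead.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, not part of nekohtml or the original program.
class FragmentToString {

    // Serializes a DOM node (a DocumentFragment works too) to a String
    // using the identity transformer, without the XML declaration.
    static String serialize(Node node) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // same setting as "omit-xml-declaration" in the program above
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    // Builds a small hand-made fragment that stands in for the parsed HTML.
    static String demo() throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        DocumentFragment fragment = document.createDocumentFragment();
        Element div = document.createElement("div");
        div.appendChild(document.createTextNode("Average Response Time server is critical. "));
        div.appendChild(document.createElement("br"));
        div.appendChild(document.createTextNode("[Threshold Details]"));
        fragment.appendChild(div);
        return serialize(fragment);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

The empty br element is written in self-closing form because the transformer uses XML output rules, which is the same reason the <br> in the input text above comes out as <br/>.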