I'm trying to write a simple HTML-to-Markdown converter in Java. I found an existing answer on converting HTML to Markdown, but it seems to be outdated and no longer works: it fails with the stack trace below. Is there any way to convert HTML to Markdown in 2018 with any of the JVM-based languages?
Both files (HTML and XSL) are properly encoded as UTF-8 and don't contain any unusual characters.
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
Here is the code I'm running:
// htmlLocation and xslLocation hold the paths to the two files (their declarations are not shown)
public static void main(String[] args) throws TransformerException {
    final String md = convert(htmlLocation);
}
public static String convert(final String htmlLocation) throws TransformerException {
    if (htmlLocation == null) {
        return "";
    }
    final File xslFile = new File(xslLocation);
    final Source htmlSource = new StreamSource(new StringReader(htmlLocation));
    final Source xslSource = new StreamSource(xslFile);
    final TransformerFactory transformerFactory = TransformerFactory.newInstance();
    final Transformer transformer = transformerFactory.newTransformer(xslSource);
    final StringWriter result = new StringWriter();
    transformer.transform(htmlSource, new StreamResult(result));
    return result.toString();
}
Content of the HTML file:
<html>
<h1>Lorem ipsum dolor</h1>
<h2>Lorem ipsum dolor</h2>
<p>Lorem ipsum dolor</p>
</html>
For anyone who is struggling with the same issue, please refer to the project that does the conversion without XSLT.
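A possible explanation for the exception, offered as a guess rather than a confirmed diagnosis: the code wraps htmlLocation in a StringReader, so if that variable holds a file path, the parser receives the path text itself as the document instead of the file's contents, and a path does not start with `<`. The sketch below contrasts the two ways of building the Source. The class name, the method names, and the tiny inline stylesheet are hypothetical stand-ins, since the real XSL file is not shown, and the approach only works when the HTML is well-formed XML.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltInputSketch {

    // Hypothetical stand-in stylesheet (assumption): handles only h1, h2 and p.
    static final String SAMPLE_XSL =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\"><xsl:apply-templates select=\"html/*\"/></xsl:template>"
        + "<xsl:template match=\"h1\"># <xsl:value-of select=\".\"/>"
        + "<xsl:text>&#10;</xsl:text></xsl:template>"
        + "<xsl:template match=\"h2\">## <xsl:value-of select=\".\"/>"
        + "<xsl:text>&#10;</xsl:text></xsl:template>"
        + "<xsl:template match=\"p\"><xsl:value-of select=\".\"/>"
        + "<xsl:text>&#10;</xsl:text></xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stand-in stylesheet to whatever Source the caller built.
    static String transform(final Source input) throws TransformerException {
        final Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(SAMPLE_XSL)));
        final StringWriter result = new StringWriter();
        transformer.transform(input, new StreamResult(result));
        return result.toString();
    }

    // Lets the parser open the file, so the file's contents are parsed.
    static String convertFile(final File htmlFile) throws TransformerException {
        return transform(new StreamSource(htmlFile));
    }

    // Mirrors the code in the post: the path text itself becomes the document.
    static String convertPathAsContent(final String htmlLocation) throws TransformerException {
        return transform(new StreamSource(new StringReader(htmlLocation)));
    }

    public static void main(String[] args) throws Exception {
        final File html = File.createTempFile("sample", ".html");
        html.deleteOnExit();
        Files.write(html.toPath(),
                ("<html><h1>Lorem ipsum dolor</h1><h2>Lorem ipsum dolor</h2>"
                + "<p>Lorem ipsum dolor</p></html>").getBytes(StandardCharsets.UTF_8));

        System.out.print(convertFile(html));
        try {
            convertPathAsContent(html.getPath());
        } catch (TransformerException e) {
            System.out.println("Path used as content failed: " + e.getMessage());
        }
    }
}
```

If this is indeed the cause, replacing `new StreamSource(new StringReader(htmlLocation))` with `new StreamSource(new File(htmlLocation))` in the original convert method may be enough to get past the parse error. Real-world HTML is often not well-formed XML, though, which may be a reason to prefer a converter that does not rely on XSLT.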