I have this String:

<dependencies style="typed">
  <dep type="dep">
    <governor idx="1">Maria</governor>
    <dependent idx="2">mrge</dependent>
  </dep>
  <dep type="dep">
    <governor idx="2">mrge</governor>
    <dependent idx="3">la</dependent>
  </dep>
  <dep type="dep">
    <governor idx="1">Maria</governor>
    <dependent idx="4">scoala</dependent>
  </dep>
</dependencies>

I tried to parse it, but the following exception is thrown and I don't know how to solve it.

This is the error:

3:1: Content is not allowed in prolog.
org.xml.sax.SAXParseException; lineNumber: 3; columnNumber: 1; Content is not allowed in prolog.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at versionTwo.Analyze.convertStringToDocument(Analyze.java:348)
    at versionTwo.Analyze.depRel(Analyze.java:299)
    at versionTwo.MainClass.main(MainClass.java:17)
Exception in thread "main" java.lang.NullPointerException
    at versionTwo.Analyze.depRel(Analyze.java:300)

And this is my code:

    public String depRel(String graph) throws SAXException, IOException,
            ParserConfigurationException {
        String xmlString;
        xmlString = Features.dependencyGraph(graph);
        String result = "";
        System.out.println("The value of the dependency graph is: " + xmlString);
        Document document = parseXmlFromString(xmlString);
        document.getDocumentElement().normalize();
        Element root = document.getDocumentElement();
        NodeList nList = document.getElementsByTagName("dependencies");
        for (int temp = 0; temp < nList.getLength(); temp++) {
            Node node = nList.item(temp);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                // Print each employee's detail
                Element eElement1 = (Element) node;
            }
            NodeList nodesDocPart = node.getChildNodes();
            for (int temp2 = 0; temp2 < nodesDocPart.getLength(); temp2++) {
                Node n = nodesDocPart.item(temp2);
                // /////////////////////////////////////////////////sentence/////////////////////////////////////////////
                NodeList nodesSentencePart = n.getChildNodes();
                for (int temp3 = 0; temp3 < nodesSentencePart.getLength(); temp3++) {
                    Node sentence = nodesSentencePart.item(temp3);
                    if (sentence.getNodeType() == Node.ELEMENT_NODE) {
                        Element eElement4 = (Element) sentence;
                        System.out.println("Sentence : "
                                + eElement4.getTextContent());
                        result = eElement4.getTextContent() + "\n";
                    }
                }
            }
        }
        return result;
    }

    public Document parseXmlFromString(String xmlString)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        InputStream inputStream = new ByteArrayInputStream(xmlString.getBytes());
        org.w3c.dom.Document document = builder.parse(inputStream);
        return document;
    }

And this is the method that creates a String from the XML after parsing a sentence. I want to read this String in another class as XML, but the error I posted above appears. Any idea?

public static String dependencyGraph(String s) {
    Properties props = new Properties();
    props.put("annotators",
            "tokenize, ssplit, pos, lemma, ner, parse, dcoref,depparse");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    Annotation document = new Annotation(s);
    pipeline.annotate(document);
    CoreMap sentence = document.get(
            CoreAnnotations.SentencesAnnotation.class).get(0);
    SemanticGraph dependency_graph = sentence
            .get(SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation.class);

    String newLine = System.getProperty("line.separator");
    //convert the output format to a string

    String graph = "\n\nDependency Graph: "
            + dependency_graph.toString(SemanticGraph.OutputFormat.XML)//save the answer like a String from the xml
            + newLine;
    // System.out.println("The graph was made=>" + graph);
    return graph;

}

Nadd
  • Possible duplicate of [Content is not allowed in Prolog SAXParserException](https://stackoverflow.com/questions/4569123/content-is-not-allowed-in-prolog-saxparserexception) – tnw Jun 20 '17 at 14:54

1 Answer

In dependencyGraph(String) you do

String graph = "\n\nDependency Graph: "
           + dependency_graph.toString(SemanticGraph.OutputFormat.XML);

which creates a string that starts with two newlines and the text "Dependency Graph: ".

You then assign this to a variable:

String xmlString;
        xmlString = Features.dependencyGraph(graph);

and then try to parse it as XML:

Document document = parseXmlFromString(xmlString);

But a string that starts with two newlines and the text "Dependency Graph" is not well-formed XML, so the XML parser complains: at line 3 column 1 it has found something that can't be the prolog of an XML document.

So the answer to your headline question is: if you want to parse a string as XML, it must contain well-formed XML.
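As a minimal sketch of the point above (not your actual code): the class name `PrologDemo` and the helper `extractXml` are hypothetical, and the XML literal is a shortened stand-in for what `dependency_graph.toString(SemanticGraph.OutputFormat.XML)` returns. It shows the parser rejecting the labelled string and accepting the same string once everything before the first `<` is dropped. Reading from a `StringReader` instead of `getBytes()` also avoids depending on the platform default charset.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class PrologDemo {

    // Parse a string as XML; throws SAXParseException if it is not well-formed.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: drop any label text that precedes the first '<'.
    // This assumes the label itself never contains '<'.
    static String extractXml(String labelled) {
        int start = labelled.indexOf('<');
        if (start < 0) {
            throw new IllegalArgumentException("no XML element found");
        }
        return labelled.substring(start).trim();
    }

    public static void main(String[] args) throws Exception {
        // Shortened stand-in for the XML produced by the dependency graph.
        String xml = "<dependencies style=\"typed\">"
                + "<dep type=\"dep\">"
                + "<governor idx=\"1\">Maria</governor>"
                + "<dependent idx=\"2\">mrge</dependent>"
                + "</dep>"
                + "</dependencies>";

        // Same shape as the string built in dependencyGraph(String).
        String labelled = "\n\nDependency Graph: " + xml + System.lineSeparator();

        try {
            parse(labelled);
        } catch (SAXParseException e) {
            // The label text comes before the root element, so parsing fails.
            System.out.println("Rejected: " + e.getMessage());
        }

        // With the label removed, the same content parses.
        Document doc = parse(extractXml(labelled));
        System.out.println("dep elements: "
                + doc.getElementsByTagName("dep").getLength());
    }
}
```

Stripping the prefix is only a workaround. The cleaner fix is to have `dependencyGraph(String)` return just the XML and add the "Dependency Graph: " label where you print it.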

Michael Kay
  • Thank you very much. Indeed, this was the problem. Now it works. – Nadd Jun 21 '17 at 18:37
  • The StackOverflow convention is to say thanks (and tell the world that the answer is correct) by marking the answer as accepted: click the tick-mark that appears next to the answer. – Michael Kay Jun 21 '17 at 23:21