54

I am writing a program in Java that takes a custom XML file and parses it. I'm using the XML file for storage. I am getting the following error in Eclipse.

[Fatal Error] :1:1: Content is not allowed in prolog.
org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:239)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:283)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:208)
    at me.ericso.psusoc.RequirementSatisfier.parseXML(RequirementSatisfier.java:61)
    at me.ericso.psusoc.RequirementSatisfier.getCourses(RequirementSatisfier.java:35)
    at me.ericso.psusoc.programs.RequirementSatisfierProgram.main(RequirementSatisfierProgram.java:23)

The beginning of the XML file is included:

<?xml version="1.0" ?>
<PSU>
     <Major id="IST">
        <name>Information Science and Technology</name>
        <degree>B.S.</degree>
        <option> Information Systems: Design and Development Option</option>
        <requirements>
            <firstlevel type="General_Education" credits="45">
                <component type="Writing_Speaking">GWS</component>
                <component type="Quantification">GQ</component>

The program is able to read in the XML file but when I call DocumentBuilder.parse(XMLFile) to get a parsed org.w3c.dom.Document, I get the error above.

It doesn't seem to me that I have invalid content in the prolog of my XML file. I can't figure out what is wrong. Please help. Thanks.

Marcus Adams
ericso
  • I found my error. I was reading in the folder the file was in and not the file itself. Apparently if you read in a folder as a file and call File.exists() on it, it will still return true. Stupid me... Thanks for all the help. – ericso Apr 08 '10 at 22:43
  • check my answer at http://stackoverflow.com/questions/3665554/about-saxparseexception-content-is-not-allowed-in-prolog/7023984 or just check this link http://mark.koli.ch/2009/02/resolving-orgxmlsaxsaxparseexception-content-is-not-allowed-in-prolog.html – Starfish Aug 11 '11 at 10:03
  • don't know if it will help anyone but I got this error trying to use flavorDimensions and putting drawable-xhdpi under res in my flavours. Once I changed it to drawable.. all fixed – dangalg Feb 13 '15 at 14:09

8 Answers

22

Check whether the XML file contains any junk characters such as �. If it does, you can strip them with the following:

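// Assumes `writer` already holds the XML text (for example, a StringWriter).
String XString = writer.toString();
// Keeps printable ASCII only; this also drops newlines, tabs, and all non-ASCII text.
XString = XString.replaceAll("[^\\x20-\\x7e]", "");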
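
A narrower alternative to the regex above, as a minimal sketch: if the only junk is a byte order mark at the very start of the text, it may be enough to remove that single character and leave the rest of the document untouched. The names BomStripper and stripBom are made up for illustration, and the sketch assumes the XML is already held in a String.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class BomStripper {
    /** Removes one leading U+FEFF (byte order mark) and nothing else. */
    static String stripBom(String xml) {
        return xml.startsWith("\uFEFF") ? xml.substring(1) : xml;
    }

    /** Parses XML text after dropping a leading BOM, if one is present. */
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(stripBom(xml))));
    }

    public static void main(String[] args) throws Exception {
        // Sample text with a BOM in front of the XML declaration.
        String xml = "\uFEFF<?xml version=\"1.0\" ?><PSU><Major id=\"IST\"/></PSU>";
        Document doc = parse(xml);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Unlike the character-class regex, this keeps newlines and any non-ASCII text inside the document.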
DaveShaw
Gopal
  • I found this really simple technique to be pretty useful as a quick fix. In order to keep newlines, though, you might prefer the regex `replaceAll("[^\\x20-\\x7e\\x0A]", "");` – Patrick Jan 11 '13 at 20:10
  • Attention: This will remove all non-ASCII characters, which is usually not what people want. – Michael Nov 29 '18 at 11:15
10

I think this is another solution to the problem.

Change the document's encoding from 'Encode in UTF-8' to 'Encode in UTF-8 without BOM'.

Making that change resolved the problem for me.

Java_Alert
7

Make sure there's no hidden whitespace at the start of your XML file. Also consider adding encoding="UTF-8" (or UTF-16, whichever matches how the file is saved) to the XML declaration.

Ben J
3

The document looks fine to me but I suspect that it contains invisible characters. Open it in a hex editor to check that there really isn't anything before the very first "<". Make sure the spaces in the XML header are spaces. Maybe delete the space before "?>". Check which line breaks are used.

Make sure the document is proper UTF-8. Some Windows editors save the document as UTF-16 (i.e. for plain ASCII text, every second byte is 0).
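
If no hex editor is at hand, a few lines of Java can show what the file really starts with. This is a rough sketch: the class name FirstBytes is hypothetical, the path comes from the command line, and the classification only covers the common UTF-8 and UTF-16 byte order marks.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical diagnostic helper, not part of any library.
class FirstBytes {
    /** Formats bytes as space-separated two-digit hex values. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    /** Describes what the leading bytes look like. */
    static String describe(byte[] b) {
        if (b.length == 0) return "empty input";
        if (startsWith(b, 0xEF, 0xBB, 0xBF)) return "UTF-8 BOM";
        if (startsWith(b, 0xFE, 0xFF)) return "UTF-16 big-endian BOM";
        if (startsWith(b, 0xFF, 0xFE)) return "UTF-16 little-endian BOM";
        if (startsWith(b, 0x3C)) return "starts with '<' (no BOM)";
        return "unexpected leading bytes";
    }

    private static boolean startsWith(byte[] b, int... prefix) {
        if (b.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((b[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FirstBytes <path-to-xml-file>");
            return;
        }
        byte[] buf = new byte[8];
        int n;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            n = in.read(buf); // -1 if the file is empty
        }
        byte[] head = Arrays.copyOf(buf, Math.max(n, 0));
        System.out.println(toHex(head) + "  (" + describe(head) + ")");
    }
}
```

Anything other than a byte order mark or 3C in front of the first "<" is a candidate for the content the parser is complaining about.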

Aaron Digulla
  • I've been editing the XML file in Eclipse text editor. I'm on a Mac and I also use BBEdit. I'll check for invisible characters. – ericso Apr 08 '10 at 13:00
  • I checked for invisible characters in BBEdit (View > Text Display > Show Invisibles) and I don't see any invisible characters in the XML declaration. I also deleted the whitespace at the end of the declaration. I added encoding="UTF-8" and encoding="UTF-16" and I'm still getting the error. – ericso Apr 08 '10 at 13:08
  • What is the encoding of the file? i.e. not what you think but what does your editor say? – Aaron Digulla Apr 08 '10 at 13:53
  • Also make sure that you're actually looking at the file which causes the error! – Aaron Digulla Apr 08 '10 at 13:54
  • I checked the encoding type in BBEdit; it is UTF-16. I'm pretty sure I'm looking at the right file. The following is my code for reading in the file and parsing it: File f = new File("/Users/thechiman/Dropbox/introcs/PSU SOC Crawler/src/resources"); //Check to see if file exists if(f.exists()) { System.out.println("file exists"); } else { System.out.println("file does not exist"); } //Use factory to get a new DocumentBuilder DocumentBuilder db = dbf.newDocumentBuilder(); //Parse the XML file, get DOM representation this.dom = db.parse(f); – ericso Apr 08 '10 at 14:59
  • Well, the parser expects UTF-8 and your file is UTF-16. This means the first byte of the file is 0 and you get the error. Save the file with the correct encoding (UTF-8) to fix the problem. – Aaron Digulla Apr 08 '10 at 15:21
  • I saved the file as UTF-8 and UTF-8, No BOM. Both times I get the same error. – ericso Apr 08 '10 at 15:54
  • In that case, you're editing a different file than the parser reads. – Aaron Digulla Apr 09 '10 at 07:18
2

You are not providing the correct path to the file. You need to provide the full path to the file itself, such as C:/Users/xyz/Desktop/myfile.xml
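
This matches what the asker reported in the comments on the question: the path pointed at a folder, and File.exists() returned true for it anyway. A small pre-check along the following lines may catch that before the parser is involved. It is only a sketch; XmlFileCheck and problemWith are hypothetical names, and the path is taken from the command line.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of any library.
class XmlFileCheck {
    /** Returns null if the path is a readable regular file, otherwise a short description of the problem. */
    static String problemWith(File f) {
        if (!f.exists()) return "does not exist: " + f;
        if (f.isDirectory()) return "is a directory, not a file: " + f;
        if (!f.isFile()) return "is not a regular file: " + f;
        if (!f.canRead()) return "is not readable: " + f;
        return null;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java XmlFileCheck <path-to-xml-file>");
            return;
        }
        File f = new File(args[0]);
        String problem = problemWith(f);
        if (problem != null) {
            System.err.println("Cannot parse, path " + problem);
            return;
        }
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document dom = db.parse(f);
        System.out.println("Root element: " + dom.getDocumentElement().getTagName());
    }
}
```

The key point is that exists() alone is not enough; isFile() is the check that distinguishes a file from the folder containing it.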

Rob Kielty
Amit Agarwal
1

I assume your XML has the proper encoding and matches the schema.

If you still get this error, check the code that unmarshals the XML and the input type you have used. Because XML documents declare their own encoding, it is preferable to create a StreamSource object from an InputStream instead of from a Reader, so that the XML processor can correctly handle the declared encoding (reference: Java in a Nutshell).
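
To illustrate the difference, here is a small sketch that applies the same idea to DocumentBuilder with an InputSource rather than to a StreamSource. It parses the same UTF-16 bytes twice: once as a byte stream, and once through a Reader that was deliberately given the wrong charset. The class name StreamVersusReader and the sample document are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of any library.
class StreamVersusReader {
    /** Returns the root tag name on success, or the parser's message on failure. */
    static String tryParse(InputSource source) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(source);
            return "parsed, root = " + doc.getDocumentElement().getTagName();
        } catch (Exception e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?><PSU/>";
        // Java's UTF_16 encoder writes a byte order mark followed by big-endian code units.
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_16);

        // Byte stream: the parser works out the encoding from the bytes and the declaration.
        System.out.println(tryParse(new InputSource(new ByteArrayInputStream(bytes))));

        // Reader: the bytes are decoded before the parser sees them, here with the wrong charset.
        Reader wrong = new InputStreamReader(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        System.out.println(tryParse(new InputSource(wrong)));
    }
}
```

With a Reader, the decoding decision has already been made by the calling code, so a mismatch between the chosen charset and the file's real encoding reaches the parser as garbage in front of the first "<".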

Hope this helps!

spark07
0

If you're able to control the xml file, try adding a bit more information to the beginning of the file:

<?xml version="1.0" encoding="UTF-16" standalone="no"?>
Drew Johnson
  • I've added both standalone="no" and standalone="yes". Both give me the same error. – ericso Apr 08 '10 at 16:37
  • hmmm...the next thing I'd try is brute force - try to get a dummy document through the parser, then slowly add parts of your original document until you can identify the problem. I've been down that road before :-) – Drew Johnson Apr 08 '10 at 16:55
0

Check for any syntax problem in the XML file. I've run into this error when working on XSL/XSP with Cocoon after defining a variable using a non-existent node or something like that. Check the whole XML.

Alfabravo
  • I get the error before I can do anything with the parsed document. It's failing when I call DocumentBuilder.parse(XMLFile). I ran the XML file through an XML validator (xmlvalidation.com) and it went through just fine. – ericso Apr 08 '10 at 16:42
  • Is the file available at the specified location? Maybe your program can't access the content of the file and the parser is just reporting that what it found is not valid XML... just guessing. – Alfabravo Apr 08 '10 at 21:17
  • @Alfabravo A slightly different question: if I get a parsing error, how can I catch the exception? The document builder in Java does not throw an exception but rather prints to the error stream, so how can I notify the user if a corrupt file was provided? – Space Rocker Apr 14 '13 at 14:10
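
On the last comment: as the stack trace in the question shows, parse() does throw a SAXParseException; the "[Fatal Error]" line on the error stream comes from the parser's default error handler in addition to that. A minimal sketch of catching the exception and replacing the handler follows. ParseErrorReport and check are hypothetical names, and the message format is only an example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of any library.
class ParseErrorReport {
    /** Returns null if the text parses, otherwise a message that can be shown to the user. */
    static String check(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler so nothing is printed to System.err.
            db.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* warnings are ignored here */ }
                @Override public void error(SAXParseException e) throws SAXParseException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            db.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return "Invalid XML at line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage();
        } catch (Exception e) {
            // Covers parser configuration and I/O problems.
            return "Could not read XML: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(check("<PSU/>"));
        System.out.println(check("junk<PSU/>"));
    }
}
```

The calling code can then decide how to present the returned message, for example in a dialog or a log entry, instead of relying on what the parser prints.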