I have a program that reads in large XML files, each containing multiple entries of data. The database I'm using it for originally contained 40,000 separate entries written as XML files, but you can download a single XML file that contains all the entries. Because of this, the XML declaration:

<?xml version="1.0" encoding="UTF-8"?>

appears multiple times throughout the document, and I was wondering whether there is some way of dealing with this using a StAX parser.

Edit: I should have said that I can't parse the whole document and read everything, because it keeps returning this error:

Exception in thread "main" javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1062,6]
Message: The processing instruction target matching "[xX][mM][lL]" is not allowed.

because the XML declaration appears multiple times. Thanks.

user2062207
  • http://stackoverflow.com/questions/19889132/xslthe-processing-instruction-target-matching-xxmmll-is-not-allowed – Stefan Nov 11 '14 at 10:20
  • Just to clarify though, I made a method that iterates through the document and tries to remove all elements that contains – user2062207 Nov 11 '14 at 12:43
  • You can't use XMLStreamReader on data that is not well-formed XML without getting an exception. See [below](http://stackoverflow.com/a/26870941/290085). – kjhughes Apr 01 '15 at 12:35

2 Answers

Until you eliminate the spurious <?xml ?> declaration(s), you cannot treat the file as XML because it is not well-formed. First treat it as text, either manually or programmatically, to eliminate the extra XML declarations before trying to parse it as XML.
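As a rough illustration of the "treat it as text first" step, here is a minimal Java sketch. It assumes the download is a plain concatenation of complete XML documents, so after the declarations are removed the entries still need a single enclosing root element before a parser will accept them. The class name `ConcatenatedXmlCleaner`, the wrapper element `entries`, and the element name `entry` are made up for the example and do not come from the question.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of any library.
final class ConcatenatedXmlCleaner {

    // Matches an XML declaration such as <?xml version="1.0" encoding="UTF-8"?>.
    // The \s after "xml" leaves instructions like <?xml-stylesheet ...?> alone.
    private static final Pattern XML_DECLARATION =
            Pattern.compile("<\\?xml\\s[^?]*\\?>");

    /** Removes every XML declaration and wraps what is left in one root element. */
    static String toSingleDocument(String concatenated, String wrapperName) {
        String body = XML_DECLARATION.matcher(concatenated).replaceAll("");
        return "<" + wrapperName + ">" + body + "</" + wrapperName + ">";
    }

    /** Walks the cleaned text with StAX and counts start tags with the given name. */
    static int countElements(String xml, String elementName) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }
}
```

With the assumed names, `ConcatenatedXmlCleaner.countElements(ConcatenatedXmlCleaner.toSingleDocument(text, "entries"), "entry")` would parse the cleaned text without the processing-instruction error. The sketch holds everything in a `String` for brevity; for a file as large as the one in the question you would probably apply the same filter while streaming. The pattern is purely textual, so it would also remove a declaration-like string that happens to sit inside a comment or CDATA section.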

For general information on all the ways the

The processing instruction target matching "[xX][mM][lL]" is not allowed.

error arises and remedies for addressing each way, see this answer (as suggested by Stefan).

kjhughes
  • ` ` these are first line & still getting error – Amit Panasara Aug 22 '19 at 07:44
  • @AmitPanasara: The XML declaration may only appear once and ***only at the very top*** of an XML document. Remove the comment that precedes the XML declaration to fix your problem. – kjhughes Aug 22 '19 at 11:40

This line is the XML declaration, which W3Schools calls the XML prolog:

<?xml version="1.0" encoding="UTF-8"?>

The XML prolog is optional. If it exists, it must come first in the document.

It must not be repeated anywhere else in the document.
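To find where a file breaks this rule, a small check along the following lines might help. It is only a sketch: the class and method names are hypothetical, and it treats the input as plain text, reporting the character offset of every XML declaration that does not sit at the very start.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class XmlDeclarationCheck {

    // Same textual pattern for <?xml version="..." ...?> declarations.
    private static final Pattern XML_DECLARATION =
            Pattern.compile("<\\?xml\\s[^?]*\\?>");

    /** Returns the offsets of all declarations that are not at position 0. */
    static List<Integer> misplacedDeclarationOffsets(String text) {
        List<Integer> offsets = new ArrayList<>();
        Matcher matcher = XML_DECLARATION.matcher(text);
        while (matcher.find()) {
            if (matcher.start() != 0) {
                offsets.add(matcher.start());
            }
        }
        return offsets;
    }
}
```

An empty list means the only declaration, if any, is at the very start. A declaration preceded by a comment or whitespace shows up in the list, as does every repeated declaration later in the text.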

Source: XML Prolog, W3Schools

Amit Panasara