18

I know XML documents usually start with something like:

<?xml version="1.0" encoding="UTF-8"?>

My question is about the <? and ?> delimiters: what do they mean on their own? That is, what does:

<?Any old text?>

mean in XML?

Thanks

More Than Five

2 Answers

17

The <? sequence starts a so-called processing instruction (PI). PIs carry instructions addressed to the application that processes the document rather than to the XML parser itself. Famously, <?php is technically a PI, and so is the instruction that attaches an XSLT stylesheet to a document:

<?xml-stylesheet href="classic.xsl" type="text/xml"?>
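
To see how a parser exposes these, here is a minimal sketch using Python's standard `xml.dom.minidom`. The `my-app` target and the helper name `list_processing_instructions` are made up for illustration; the point is only that each PI splits into a target (the name right after `<?`) and free-form data, which the application is left to interpret.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Sample document: an XML declaration, the stylesheet PI from above,
# and a hypothetical application-specific PI ("my-app" is an invented target).
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="classic.xsl" type="text/xml"?>
<?my-app any old text?>
<root/>
"""


def list_processing_instructions(xml_bytes):
    """Return (target, data) pairs for the PIs that sit at document level."""
    dom = minidom.parseString(xml_bytes)
    return [
        (node.target, node.data)
        for node in dom.childNodes
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
    ]


pis = list_processing_instructions(DOC)
for target, data in pis:
    print(f"target={target!r} data={data!r}")
```

Note that the `<?xml ...?>` line does not show up in the result: the parser consumes it as the XML declaration instead of reporting it as a processing instruction.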
Boldewyn
13

It's part of the prolog; see 'The XML Prolog' in XML Syntax Rules at w3schools.com.

For a more thorough definition see '2.8 Prolog and Document Type Declaration' in 'Extensible Markup Language (XML)' at W3C.

Note that although the XML declaration (<?xml ...?>), which w3schools calls the XML Prolog, looks like a special kind of processing instruction (<?...?>), it is a separate construct.

The '2.1 Well-Formed XML Documents' section of the W3C document describes a well-formed XML document as having exactly one prolog, and the definition in section 2.8 says that the prolog can contain at most one XMLDecl, which is <?xml ....?>, and an arbitrary number of PIs.

What's more, the XMLDecl, if present, must be the very first thing in the prolog, while PIs may follow in any order, possibly separated by comments and whitespace.

Additionally, the '2.6 Processing Instructions' definition explicitly excludes xml (in any combination of upper and lower case) from the names allowed as a PI target.
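
These rules can be checked against a real parser. The following is a small sketch, assuming Python's standard `xml.dom.minidom` (which is backed by expat); the helper name `is_well_formed` and the PI targets `app-hint` and `other` are invented for the example.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(xml_bytes):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        minidom.parseString(xml_bytes)
        return True
    except ExpatError:
        return False


# XML declaration first, then several PIs and a comment in the prolog.
ok_many_pis = is_well_formed(
    b'<?xml version="1.0"?><?app-hint data?><!-- note --><?other x?><root/>'
)

# The XML declaration is optional, so a prolog with only PIs is fine too.
ok_no_decl = is_well_formed(b"<?app-hint data?><root/>")

# An XML declaration that is not at the very start is rejected.
bad_late_decl = is_well_formed(b'<!-- note --><?xml version="1.0"?><root/>')

# "XML" in another letter case is not a legal PI target either.
bad_reserved_target = is_well_formed(b"<?XML data?><root/>")

print(ok_many_pis, ok_no_decl, bad_late_decl, bad_reserved_target)
```

The first two documents parse, while the last two raise a parse error, matching the grammar rules quoted above.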

CiaPan
  • Can you have multiple prolog statements in an XML document, ever? For example, one to state the version and encoding and another to just pass some information to the engine? – More Than Five Aug 24 '16 at 09:24
  • 2
    @MoreThanFive Yes and no. Yes, you can have multiple processing instructions (`<?...?>`), but no, you can have only one XML declaration (`<?xml ...?>`). The ['2.1 Well-Formed XML Documents'](http://www.w3.org/TR/REC-xml/#sec-well-formed) section of the W3C document describes a well-formed XML document as having exactly one prolog, and the prolog definition says it can contain at most one XMLDecl, which is `<?xml ...?>`. – CiaPan Aug 24 '16 at 09:33
  • 1
    @MoreThanFive (contd.) ...and the ['2.6 Processing Instructions'](http://www.w3.org/TR/REC-xml/#sec-pi) definition explicitly excludes `xml` from allowed `name`s of PI. – CiaPan Aug 24 '16 at 09:40