26

I have a PHP file that prints XML built from a MySQL database.

I get an error every time, at exactly the point where there is an & sign.

Here is some of the PHP:

$query = mysql_query($sql);

$_xmlrows = '';

while ($row = mysql_fetch_array($query)) {
    $_xmlrows .= xmlrowtemplate($row);
}

function xmlrowtemplate($dbrow){
    return "<AD>
              <CATEGORY>".$dbrow['category']."</CATEGORY>
            </AD>";
}

The output is what I want (the file prints the correct category), but I still get an error.

The error says: xmlParseEntityRef: no name

It then points to the exact character, which is an & sign.

It complains only when $dbrow['category'] contains an & sign, for example: "cars & trucks" or "computers & telephones".

Anybody know what the problem is?

BTW: I have the encoding set to UTF-8 in all documents, as well as the xml output.

simhumileco
  • Please share more details. Also, please explain how this is related to [tag:html], [tag:mysql], or [tag:database] – Nico Haase Oct 06 '21 at 08:56

5 Answers

55

In XML, & starts an entity reference. Since no entity named &WhateverIsAfterThat is defined (in "cars & trucks" the & is followed by a space, so there is no name at all), the parser throws an error. Escape it as &amp;.

$string = str_replace('&', '&amp;', $string);

How do I escape ampersands in XML

To escape the other reserved characters:

function xmlEscape($string) {
    return str_replace(array('&', '<', '>', '\'', '"'), array('&amp;', '&lt;', '&gt;', '&apos;', '&quot;'), $string);
}
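
To connect this back to the question, here is a minimal sketch of the asker's xmlrowtemplate() with the category passed through xmlEscape() before it is interpolated. The $row array is a hypothetical stand-in for a mysql_fetch_array() result, and the whitespace in the template is simplified.

```php
<?php
// Escape the five XML-reserved characters. '&' is replaced first so the
// ampersands introduced by the other entities are not escaped again.
function xmlEscape($string) {
    return str_replace(
        array('&', '<', '>', '\'', '"'),
        array('&amp;', '&lt;', '&gt;', '&apos;', '&quot;'),
        $string
    );
}

// The question's row template, with the value escaped before interpolation.
function xmlrowtemplate($dbrow) {
    return "<AD>\n  <CATEGORY>" . xmlEscape($dbrow['category']) . "</CATEGORY>\n</AD>\n";
}

// Hypothetical row standing in for a mysql_fetch_array() result.
$row = array('category' => 'cars & trucks');
echo xmlrowtemplate($row);
```

Escaping at the point where the value enters the markup keeps the database contents untouched and avoids escaping the same string twice.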
NikiC
  • (Addition, but for poster) So use &amp; to properly escape it, although instead of (dumb) string interpolation you should use something that *understands* XML (e.g. what happens when the input contains "<"?) –  Oct 26 '10 at 18:03
  • 15
    Or, more compact, `htmlspecialchars($string, ENT_QUOTES);` – Wrikken Oct 26 '10 at 22:37
  • 2
    wrapping in <![CDATA tags is the more logical solution – matthy Dec 27 '13 at 17:52
  • 3
    To be sure the string is actually safe, I think it should be done in two stages. For example: $string = 'Foo &amp; Bar'; $string = str_replace('&amp;', '&', $string); // Foo & Bar $string = str_replace('&', '&amp;', $string); // Foo &amp; Bar If there is only one stage, the result can be 'Foo &amp;amp; Bar' – sznowicki Oct 22 '14 at 20:43
  • 4
    I would suggest you want to use the `ENT_XML1` option: `htmlspecialchars($string, ENT_XML1);` to ensure that the string is escaped appropriately for XML. – joshweir Nov 07 '15 at 11:17
6

$string = htmlspecialchars($string, ENT_XML1);

is the most general way to handle all of these escaping errors (IMHO it is better than writing custom functions, and there is no point in handling only &).

Credit: I posted Wrikken's and joshweir's comments as an answer to make them more visible.
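
A minimal sketch of this approach follows. It assumes that combining ENT_XML1 with ENT_QUOTES is wanted so that quote characters are escaped as well (relevant if the value ends up inside an attribute). The helper name xmlText() and the sample strings are hypothetical.

```php
<?php
// Assumption: ENT_QUOTES is combined with ENT_XML1 so that both quote
// characters are escaped too, which matters inside attribute values.
function xmlText($string) {
    return htmlspecialchars($string, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Hypothetical sample values, similar to the categories in the question.
$samples = array('cars & trucks', 'price < 100', 'say "hi"');
foreach ($samples as $sample) {
    echo '<CATEGORY>' . xmlText($sample) . "</CATEGORY>\n";
}
```

Note that htmlspecialchars() re-encodes existing entities by default, so a string that already contains &amp; should not be passed through it a second time.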

pevik
2

You need to either turn & into its entity &amp;, or wrap the contents in CDATA tags.

If you choose the entity route, there are additional characters you need to turn into entities:

>  &gt;
<  &lt;
'  &apos;
"  &quot;

Background: Beware of the ampersand when using XML

Wikipedia: List of XML character entity references
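
For the CDATA route, a minimal sketch might look like the following. The helper name xmlCdata() is hypothetical. The one thing a CDATA section cannot contain is the sequence "]]>", so the sketch splits it across two adjacent sections.

```php
<?php
// Wrap text in a CDATA section. The sequence "]]>" would end the section
// early, so it is split across two adjacent CDATA sections.
function xmlCdata($string) {
    return '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $string) . ']]>';
}

// The ampersand is left as-is; the parser treats CDATA content literally.
echo '<CATEGORY>' . xmlCdata('cars & trucks') . "</CATEGORY>\n";
```

CDATA only works for element content, not attribute values, so the entity route is still needed for attributes.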

Pekka
0

An XML escape function using a regex and a switch statement (JavaScript):

function XmlEscape(str) {
    if (!str || str.constructor !== String) {
        return "";
    }

    return str.replace(/[\"&><]/g, function (match) {
        switch (match) {
        case "\"":
            return "&quot;";
        case "&":
            return "&amp;";
        case "<":
            return "&lt;";
        case ">":
            return "&gt;";
        }
    });
};
Huseyin Durmus
-1
public function sanitize(string $data) {
    return str_replace('&', '&amp;', $data);
}

You are right, here is more context: the example shows how to deal with data containing '&' before passing that data to SimpleXML. Another solution is to wrap the data in <![CDATA[some stuff]]>.

Denise Ignatova