I have the XML below, which produces an error in the productName node. I have run it through the XML Validator and removed parts of the content to try to narrow down what's causing the issue, but I cannot work out what it is. Can anyone assist?

<?xml version="1.0" ?> 
<product>
<prod_id>8236888</prod_id>
<productURL></productURL>
<productImageURL>2mp_hd_sdi.jpg</productImageURL>
<price>349.99</price>
<category>1646290</category>
<qtyInStock>10</qtyInStock>
<offerType>1</offerType>
<offerPrice>299.99</offerPrice>
<VAT>1</VAT>
<discount>0</discount>
<vat_rate></vat_rate>
<productName>SDI Bullet IR Camera Full HD 1080P 1920x10803.3-12mm Varifocal Lens  1/3&amp;rdquo; 2MP CMOS36 IR LEDs  Outdoor / Indoor</productName>
</product>

XML Parsing Error: not well-formed
Location: http://www.w3schools.com/xml/xml_validator.asp
Line Number 14, Column 35:
SDI Bullet IR Camera Full HD 1080P 1920x10803.3-12mm Varifocal Lens 1/3&rdquo; 2MP CMOS36 IR LEDs Outdoor / Indoor
----------------------------------^

LeeTee
  • When I pasted your XML into Notepad++, it showed me a US separator at line 14, column 35. What is the source of your XML? – Shashank Kadne Sep 18 '13 at 10:16
  • It feels like the `&amp;rdquo;` text is double-encoded. `&amp;` -> `&`, `&rdquo;` -> `”`. How is the XML generated? Is that something you can control? – Passerby Sep 18 '13 at 10:18
  • There are lots of very similar posts on the web and here on SO. Check out the answer from @PrashantBalan on [this SO post](http://stackoverflow.com/questions/12229572/php-generated-xml-shows-invalid-char-value-27-message) – rwisch45 Sep 18 '13 at 10:18
  • ah yes I can see the US separator. I pull this info from a database where users input their data. What is the US separator and how do I check and remove these in my PHP script that generates the XML? – LeeTee Sep 18 '13 at 10:52

1 Answer

> What is the US separator

A "Unit Separator" is an ASCII control code with value 31 (hex 1F). See https://en.wikipedia.org/wiki/Unit_separator#Field_separators

The XML Validator is complaining because U+001F is not a valid character in XML 1.0: of the control codes below U+0020, only tab, line feed and carriage return are allowed. See https://en.wikipedia.org/wiki/XML#Valid_characters
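
To see which characters a validator would reject, something along these lines could help. It is only a sketch: the function name find_invalid_xml_chars is made up for illustration, the sample string is invented, and it assumes the text is valid UTF-8 (with the /u modifier, preg_match_all() returns false otherwise).

```php
<?php
// Hypothetical helper (not from the question): lists every character that
// XML 1.0 forbids, with its byte offset and its bytes in hex.
// Assumes $text is valid UTF-8.
function find_invalid_xml_chars(string $text): array
{
    // Negated class of the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    $pattern = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';

    if (preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE) === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    $found = [];
    foreach ($matches[0] as [$char, $offset]) {
        $found[] = ['offset' => $offset, 'hex' => strtoupper(bin2hex($char))];
    }
    return $found;
}

// Invented sample value containing a Unit Separator (0x1F).
$problems = find_invalid_xml_chars("Full HD 1080P\x1F 1920x1080");
print_r($problems);
```

An empty array means the string is safe to put in an XML 1.0 document as far as character validity goes; it still needs the usual escaping of &, < and >.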

> and how do I check and remove these in my PHP script that generates the XML?

After you've pulled the data from your database, the product name is presumably in a string somewhere (possibly in an array or object). For simplicity, assume it's the value of variable $product_name. To remove US from the string, you could do this:

$product_name = str_replace("\x1F", "", $product_name);

or this:

$product_name = preg_replace( '/\x1F/', '', $product_name);

If other XML-invalid control codes might occur in the product name, you would have to generalize the function call to remove those as well.
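
One possible generalization, as a sketch only: strip everything that falls outside the XML 1.0 character ranges in a single preg_replace() call. The function name strip_invalid_xml_chars is hypothetical, the sample string is invented, and the code assumes the data is valid UTF-8 (with the /u modifier, preg_replace() returns null if it is not).

```php
<?php
// Hypothetical helper (not part of the original answer): removes every
// character that XML 1.0 does not allow. Assumes $text is valid UTF-8.
function strip_invalid_xml_chars(string $text): string
{
    // Keep only the XML 1.0 "Char" ranges:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    $clean = preg_replace(
        '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
        '',
        $text
    );

    if ($clean === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $clean;
}

// Invented sample value containing a Unit Separator (0x1F).
$product_name = "Full HD 1080P\x1F 1920x1080";
$product_name = strip_invalid_xml_chars($product_name);
```

Whether to delete the control code or replace it with a space depends on the data; here it is simply deleted, matching the str_replace() call above.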

(And be sure to check if other database fields might contain control codes.)

Michael Dyck