
I've been looking everywhere for a detailed explanation of how XMLWriter() encodes its output, but couldn't find one. I would like to know what encoding the input data should be in if I want a specific output encoding, for example ISO-8859-1. Should I give it the input data in that same encoding?

For example here:

$xw->writeElement('garantie','Garantie à vie');
$xw->endElement();

Should I do any encoding conversion on the string 'Garantie à vie' or does the XMLWriter() convert it automatically? Should the string be in ISO-8859-1 or UTF-8?

hakre
Pato

1 Answer


Should I do any encoding conversion on the string 'Garantie à vie' or does the XMLWriter() convert it automatically?

XMLWriter in PHP accepts UTF-8 string input and automatically re-encodes it into the output encoding if needed. Re-encoding is not always necessary, because the default encoding of an XML document is already UTF-8.

Should the string be in ISO-8859-1 or UTF-8?

The string should be UTF-8 encoded.
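
If the source data arrives in ISO-8859-1 rather than UTF-8, it would need to be converted before it is passed to XMLWriter. A minimal sketch, assuming the mbstring extension is available; the variable names `$latin1` and `$utf8` are illustrative only and not from the question:

```php
<?php
// Assumption: $latin1 holds ISO-8859-1 bytes (0xE0 is "à" in that encoding).
$latin1 = "Garantie \xE0 vie";

// Convert to UTF-8 before handing the string to XMLWriter.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

$xmlWriter = new XMLWriter();
$xmlWriter->openMemory();
$xmlWriter->startDocument('1.0', 'UTF-8');
$xmlWriter->writeElement('garantie', $utf8);
$xmlWriter->endDocument();
$xml = $xmlWriter->flush();
echo $xml;
```

`iconv('ISO-8859-1', 'UTF-8', $latin1)` should work as an alternative if mbstring is not installed.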

Example (with a UTF-8 encoded string; Demo):

<?php
/**
 * About PHP XMLWriter encoding input and output
 *
 * @link https://stackoverflow.com/a/19046825/367456
 * @link https://eval.in/51120
 */

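$xmlWriter = new XMLWriter();
$xmlWriter->openMemory();
// Declare the output encoding; the UTF-8 input is re-encoded to it.
$xmlWriter->startDocument('1.0', 'US-ASCII');
// The string literal is UTF-8 encoded (this file is saved as UTF-8).
$xmlWriter->writeElement('garantie', 'Garantie à vie');
$xmlWriter->endDocument();
echo $xmlWriter->flush();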

Output:

<?xml version="1.0" encoding="US-ASCII"?>
<garantie>Garantie &#224; vie</garantie>
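
The same approach can be sketched for the ISO-8859-1 case from the question. This is only an illustration: the variable `$input` is made up for the example, and whether a given encoding name is accepted depends on the PHP/libxml build. The input is written as explicit UTF-8 bytes so the result does not depend on how the script file itself is saved.

```php
<?php
// "\xC3\xA0" is the UTF-8 byte sequence for "à".
$input = "Garantie \xC3\xA0 vie";

$xmlWriter = new XMLWriter();
$xmlWriter->openMemory();
// Request ISO-8859-1 as the output encoding.
$xmlWriter->startDocument('1.0', 'ISO-8859-1');
$xmlWriter->writeElement('garantie', $input);
$xmlWriter->endDocument();
$xml = $xmlWriter->flush();

// In ISO-8859-1, "à" is the single byte 0xE0, so if the encoding is
// supported the output should contain that byte instead of the UTF-8 pair.
var_dump(strpos($xml, "Garantie \xE0 vie") !== false);
```

Because ISO-8859-1 can represent "à" directly, no numeric character reference is needed here, unlike in the US-ASCII output above.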


Community
hakre
  • Sorry, it should say: $xw->writeElement('garantie','Garantie à vie'); So, if I understand correctly, it is enough to make sure the text 'Garantie à vie' is in UTF-8 and then the XMLWriter output will be in the encoding I set? For example: $xw->startDocument('1.0','ISO-8859-1'); – Pato Sep 27 '13 at 09:12
  • @Pato: Yes, exactly that way. The second parameter containing `'ISO-8859-1'` can be upper or lower case (it's common to use upper-case). The character-encoding-names which can be used depend on your PHP configuration, `ISO-8859-1` should not pose any problems. – hakre Sep 27 '13 at 09:17
  • And while commenting that: The demo site I linked *does not support ISO-8859-1* - don't ask me why :D Never experienced that before, but as written, this depends on PHP configuration. – hakre Sep 27 '13 at 09:23