I have a number of fields that are completed by a user on a form and are then sent to a web service via SOAP. When I build my XML to pass along the user entries, it generally works without issue. However, I'm running into issues in a few cases where the API fails and I know it's related to characters within what the user entered.

Is there a proper way to escape a string that will be sent via XML? I've read many threads that recommend htmlspecialchars() and just as many that call it bad practice. I also recently realized that I should probably change the encoding to UTF-8. Is that right?

I'm sure I will probably get down voted for this post as I admittedly don't have much expertise in XML. Looking for the best practice here so my call to this API is as reliable as possible and sincerely appreciate any guidance.

Here's the XML snippet:

//BUILD FIELD DATA
$xmlStr = "<?xml version=\"1.0\" encoding=\"us-ascii\"?>
<record>
<field Name=\"dateCreated\" Text=\"".$DATE_CREATE."\" />
<field Name=\"purpose\" Text=\"".$PURPOSE."\" />
<field Name=\"comments\" Text=\"".$COMMENTS."\" />
<field Name=\"terms\" Text=\"".$TERMS."\" />
</record>";
Jason
  • I don't have an answer, I just wanted to say no one should be down-voting you for your lack of knowledge. You wouldn't be here if you knew everything. Votes are supposed to reflect the quality and relevance of the question. I too have seen several string escape questions, [one I asked and answered myself](http://stackoverflow.com/questions/12699037/how-to-display-special-characters-in-php/12784217). I'm not at all an expert, so good luck to you! – Phil Tune Dec 09 '14 at 19:34
  • @philtune Thank you! I sincerely appreciate that! – Jason Dec 09 '14 at 19:36
  • Encoding to UTF-8 is a good idea, especially if you will be receiving input in multiple languages. – Jonathan M Dec 09 '14 at 19:37
  • Where are you validating the input in PHP? – Jonathan M Dec 09 '14 at 19:37

1 Answer

I would recommend using PHP's XMLWriter class to build XML documents. The PHP manual page for xmlwriter_open_memory has a nice, simple usage example in the user-contributed notes section: http://php.net/manual/en/function.xmlwriter-open-memory.php. XMLWriter escapes special characters such as &, < and " in attribute values and element text for you, so you don't have to escape them by hand.
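As a rough sketch of that approach, here is how the record from the question might be built with XMLWriter. The helper name buildRecordXml and the sample values are made up for illustration, and declaring UTF-8 assumes the web service accepts it:

```php
<?php
// Hypothetical helper: builds the <record> document from the question.
// $fields maps field names to raw, unescaped user input.
function buildRecordXml(array $fields): string
{
    $writer = new XMLWriter();
    $writer->openMemory();
    // Assumption: the receiving service accepts UTF-8.
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('record');
    foreach ($fields as $name => $text) {
        $writer->startElement('field');
        $writer->writeAttribute('Name', $name);
        // writeAttribute() escapes &, < and " in the value.
        $writer->writeAttribute('Text', $text);
        $writer->endElement(); // closes <field>
    }
    $writer->endElement(); // closes <record>
    $writer->endDocument();
    return $writer->outputMemory();
}

// Sample values are invented; in the question they come from the form.
$xmlStr = buildRecordXml([
    'purpose'  => 'R&D "pilot" <phase 1>',
    'comments' => 'Café visit',
]);
```

Because the writer does the escaping, the raw form values can be passed in unchanged, and a parser on the other end should read back exactly what the user typed.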

You could also look into wrapping the contents of XML elements with CDATA tags like so:

<field Name="purpose"><![CDATA[ Here's < some " crazy !/> characters! ]]></field>

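This tells the XML parser to treat everything inside the CDATA section as literal text rather than markup. The one sequence that cannot appear inside is ]]>, because it ends the section. Also note that CDATA only works in element content, so it would not help with attribute values like the Text="..." attributes in your snippet unless the service accepts the value as element text instead.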
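If you go the CDATA route with arbitrary user input, the ]]> case needs handling. A common trick is to split the text into two adjacent CDATA sections at that point. A minimal sketch, where wrapInCdata is a made-up helper name and the sample text is invented:

```php
<?php
// Hypothetical helper: wraps text in a CDATA section, splitting any "]]>"
// across two sections so it cannot end the section early.
function wrapInCdata(string $text): string
{
    return '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $text) . ']]>';
}

// Invented sample input that contains the problematic "]]>" sequence.
$raw = 'a < b && c[d[0]]> e';
$field = '<field Name="purpose">' . wrapInCdata($raw) . '</field>';
```

A parser concatenates the adjacent sections, so the element's text content should come back as the original string.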

You should also be cleaning up any user input you accept to make sure you avoid malicious behavior by the bad guys. I recommend looking into filter_var. It's not bulletproof, but it will definitely help. I use the following in some of my code to clean up inputs to my PHP scripts:

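// Recursively trims and HTML-escapes every key and value of an input array.
function cleanInput($input){
    if(is_array($input)){
        $clean = array();
        foreach($input as $key => $val){
            // Reassigning $key inside foreach does not rename the array key,
            // so the result is rebuilt under the cleaned keys.
            $clean[cleanInput($key)] = cleanInput($val);
        }
        return $clean;
    }
    // Encodes ' " < > & and control characters as HTML entities.
    return filter_var(trim($input), FILTER_SANITIZE_SPECIAL_CHARS);
}
$_POST = cleanInput($_POST);
$_GET = cleanInput($_GET);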
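Keep in mind that FILTER_SANITIZE_SPECIAL_CHARS encodes for HTML. If you stay with the hand-built string from the question, one alternative is to escape each value for XML at the point where it goes into the document, using htmlspecialchars() with the ENT_XML1 flag. This is only a sketch: xmlEscape is a made-up helper name, the sample value is invented, and it assumes the input is valid UTF-8 (htmlspecialchars() returns an empty string for invalid input unless ENT_SUBSTITUTE is also set):

```php
<?php
// Hypothetical helper: escapes a UTF-8 string for use inside a
// double-quoted XML attribute value or element text.
function xmlEscape(string $value): string
{
    return htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Invented sample value standing in for a form field.
$PURPOSE = 'Q&A for "new" <clients>';
$xmlStr = '<field Name="purpose" Text="' . xmlEscape($PURPOSE) . '" />';
// $xmlStr is now:
// <field Name="purpose" Text="Q&amp;A for &quot;new&quot; &lt;clients&gt;" />
```

This does not strip control characters that XML 1.0 forbids, so those would still need separate handling.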

Hope that helps you get started.

Glen