65

I was trying to check whether a string is valid XML using the simplexml_load_string() function, but it displays a lot of warning messages.

How can I check whether a string is valid XML without suppressing the error (@ at the beginning) and displaying a warning function that expec

hakre
jspeshu

7 Answers

93

Use libxml_use_internal_errors() to suppress all XML errors, and libxml_get_errors() to iterate over them afterwards.

Loading a string with SimpleXML

libxml_use_internal_errors(true);

$doc = simplexml_load_string($xmlstr);
$xml = explode("\n", $xmlstr);

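// Compare strictly: a valid but empty or namespaced document can be falsy.
if ($doc === false) {
    $errors = libxml_get_errors();

    foreach ($errors as $error) {
        // display_xml_error() is a user-defined helper from the PHP manual
        // page for libxml_get_errors(), not a built-in function.
        echo display_xml_error($error, $xml);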
    }

    libxml_clear_errors();
}
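
For anyone who does not want to copy the manual's helper, here is a minimal sketch of a formatter in the same spirit. The names format_xml_error and xml_parse_errors are hypothetical (they are not libxml functions), and the output layout is just one possible choice:

```php
<?php
// Hypothetical helper (not part of libxml): turn one LibXMLError into text.
function format_xml_error(LibXMLError $error): string
{
    $levels = [
        LIBXML_ERR_WARNING => 'Warning',
        LIBXML_ERR_ERROR   => 'Error',
        LIBXML_ERR_FATAL   => 'Fatal error',
    ];
    $label = $levels[$error->level] ?? 'Unknown';

    return sprintf(
        '%s %d: %s (line %d, column %d)',
        $label,
        $error->code,
        trim($error->message),
        $error->line,
        $error->column
    );
}

// Hypothetical wrapper: returns formatted messages, or an empty array
// when the string parses.
function xml_parse_errors(string $xmlstr): array
{
    libxml_use_internal_errors(true);
    libxml_clear_errors(); // drop errors left over from earlier parsing

    $doc = simplexml_load_string($xmlstr);
    $messages = $doc === false
        ? array_map('format_xml_error', libxml_get_errors())
        : [];

    libxml_clear_errors();
    return $messages;
}
```

Calling xml_parse_errors($xmlstr) and checking for an empty array then replaces the explicit loop.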
onkar
Tjirp
  • 7
    Close to perfect...would just add that the display_xml_error function is simply a custom function to output the errors in a nice way, it can be found here http://php.net/manual/en/function.libxml-get-errors.php. At first I thought it was an internal function that I was missing. – Carlton Jul 25 '14 at 15:09
  • 10
    Be careful with `if (!$doc)`! PHP considers for example a namespaced document as empty and therefore `!$doc === TRUE`. – David Jan 14 '16 at 14:54
  • 6
    I ran into the problem as @David mentioned, I had to explicitly check for `if($doc !== FALSE)` instead of just `if($doc)` which would normally be enough. – Samsquanch May 18 '16 at 15:49
28

From the documentation:

Dealing with XML errors when loading documents is a very simple task. Using the libxml functionality it is possible to suppress all XML errors when loading the document and then iterate over the errors.

The libXMLError object, returned by libxml_get_errors(), contains several properties including the message, line and column (position) of the error.

libxml_use_internal_errors(true);
$sxe = simplexml_load_string("<?xml version='1.0'><broken><xml></broken>");
if ($sxe === false) {
    echo "Failed loading XML\n";
    foreach(libxml_get_errors() as $error) {
        echo "\t", $error->message;
    }
}

Reference: libxml_use_internal_errors

Felix Kling
14

Try this one:

// Check whether a string is well-formed XML.
// "public" implies this method lives inside a class.
public function _isValidXML($xml) {
    // @ silences the parser warnings.
    $doc = @simplexml_load_string($xml);
    // Compare strictly: a valid but empty element is falsy.
    return $doc !== false;
}
Jkhaled
13

My version looks like this:

// Validates XML only. HTML documents are rejected.

function isValidXml($content)
{
    $content = trim($content);
    if (empty($content)) {
        return false;
    }
    // Reject HTML documents outright.
    if (stripos($content, '<!DOCTYPE html>') !== false) {
        return false;
    }

    libxml_use_internal_errors(true);
    simplexml_load_string($content);
    $errors = libxml_get_errors();          
    libxml_clear_errors();  

    return empty($errors);
}

Tests:

//false
var_dump(isValidXml('<!DOCTYPE html><html><body></body></html>'));
//true
var_dump(isValidXml('<?xml version="1.0" standalone="yes"?><root></root>'));
//false
var_dump(isValidXml(null));
//false
var_dump(isValidXml(1));
//false
var_dump(isValidXml(false));
//false
var_dump(isValidXml('asdasds'));
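
One thing to watch: libxml_use_internal_errors(true) changes a global setting and the function above never switches it back. Below is a rough sketch of a variant that restores whatever mode was active before the call. The name isWellFormedXml is hypothetical, and it relies on the documented behaviour that libxml_use_internal_errors() returns the previous value:

```php
<?php
// Hypothetical variant of isValidXml() that leaves libxml's global
// error-handling mode the way it found it.
function isWellFormedXml(string $content): bool
{
    if (trim($content) === '') {
        return false;
    }

    // libxml_use_internal_errors() returns the previous setting.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors(); // drop errors left over from earlier parsing

    $doc = simplexml_load_string($content);
    $errors = libxml_get_errors();

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's mode

    // Strict comparison: a valid but empty element is falsy.
    return $doc !== false && count($errors) === 0;
}
```

This matters mostly in larger applications where other code expects libxml warnings to surface as regular PHP warnings.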
admin
3

Here is a small class I wrote a while ago:

/**
 * Class XmlParser
 * @author Francesco Casula <fra.casula@gmail.com>
 */
class XmlParser
{
    /**
     * @param string $xmlFilename Path to the XML file
     * @param string $version 1.0
     * @param string $encoding utf-8
     * @return bool
     */
    public function isXMLFileValid($xmlFilename, $version = '1.0', $encoding = 'utf-8')
    {
        $xmlContent = file_get_contents($xmlFilename);
        return $this->isXMLContentValid($xmlContent, $version, $encoding);
    }

    /**
     * @param string $xmlContent A well-formed XML string
     * @param string $version 1.0
     * @param string $encoding utf-8
     * @return bool
     */
    public function isXMLContentValid($xmlContent, $version = '1.0', $encoding = 'utf-8')
    {
        if (trim($xmlContent) == '') {
            return false;
        }

        libxml_use_internal_errors(true);

        $doc = new DOMDocument($version, $encoding);
        $doc->loadXML($xmlContent);

        $errors = libxml_get_errors();
        libxml_clear_errors();

        return empty($errors);
    }
}

It works fine with streams and vfsStream as well for testing purposes.

Francesco Casula
2

Case

I need to check the availability of a Google Merchant XML feed from time to time.

The feed has no DTD, so validate() won't work.

Solution

// disable forwarding those load() errors to PHP
libxml_use_internal_errors(true);
// initiate the DOMDocument and attempt to load the XML file
$dom = new \DOMDocument;
$dom->load($path_to_xml_file);
// check if the file contents are what we're expecting them to be
// `item` here is for Google Merchant, replace with what you expect
$success = $dom->getElementsByTagName('item')->length > 0;
// alternatively, just check if the file was loaded successfully
$success = null !== $dom->actualEncoding;

length above holds the number of products actually listed in the file. You can use your own tag names instead.

Logic

You can call getElementsByTagName() with any other tag name (item is what Google Merchant uses, your case may vary), or read other properties on the $dom object itself. The logic stays the same: instead of checking whether errors occurred while loading the file, I believe that actually trying to work with it (or specifically checking that it contains the values you need) is more reliable.

Most important: unlike validate(), this won't require your XML to have a DTD.
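
The same idea works for a string instead of a file. The sketch below is only an illustration: xmlHasElement is a hypothetical name, and it additionally uses the boolean return value of DOMDocument::loadXML() so that a document that fails to parse is rejected before any tag lookup:

```php
<?php
// Hypothetical helper: true when $xml parses and contains at least one
// element named $tagName.
function xmlHasElement(string $xml, string $tagName): bool
{
    // Guard first: loadXML() rejects an empty string.
    if (trim($xml) === '') {
        return false;
    }

    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $loaded = $dom->loadXML($xml); // false when parsing fails

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $loaded && $dom->getElementsByTagName($tagName)->length > 0;
}
```

For a Google Merchant style feed you would call xmlHasElement($feed, 'item').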

ᴍᴇʜᴏᴠ
0

Solution

<?php
/**
* Check whether the XML is correct
* 
* @param string $xmlstr
* @return bool
*/
public function checkXML($xmlstr)
{
    libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xmlstr);
    if (!$doc) {
        $errors = libxml_get_errors();
        if (count($errors)) {
            libxml_clear_errors();
            return false;
        }
    }
    return true;
}
kkasp