-3

Possible Duplicate:
format xml string

I'm generating an XML page like so:

            header('Content-Type: text/html');              

            $xmlpage = '<?xml version="1.0" charset="utf-8"?>';

            $xmlpage .= '<conv>';
            $xmlpage .= '<at>6 January 2012 12:00</at>';
            $xmlpage .= '<rate>1.56317</rate>';

            $xmlpage .= '<from>';
            $xmlpage .= '<code>'.$from.'</code>';
            $xmlpage .= '<curr>Pound Sterling</curr>';
            $xmlpage .= '<loc>UK</loc>';
            $xmlpage .= '<amnt>'.$amnt.'</amnt>';
            $xmlpage .= '</from>';

            $xmlpage .= '</conv>';

            echo $xmlpage;

When viewing the page source, it looks terrible:

 <?xml version="1.0" charset="utf-8"?><conv><at>6 January 2012 12:00</at><rate>1.56317</rate><from><code>USD</code><curr>Pound Sterling</curr><loc>UK</loc><amnt>23</amnt></from><to><code>GBP</code><curr>United States Dollar</curr><loc>USA</loc><amnt>14.73</amnt></to></conv>

How can I make this so it's properly formatted and indented?

Community
tctc91

7 Answers

5

Add newlines with \r\n or just \n. PHP only interprets these escape sequences inside double-quoted strings, so you have four options: switch each string to double quotes and use single quotes (') for the XML attribute values, switch to double quotes and escape the inner double quotes (\"), keep the single-quoted strings and append ."\r\n" as a line break, or use a HEREDOC.
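As a quick check of the quoting rule above, this sketch compares the same markup in single and double quotes. The variable names are made up for illustration, and the declaration uses encoding (the attribute name the XML declaration defines) rather than charset.

```php
<?php
// Single quotes: \n stays as two literal characters (backslash + n).
$single = '<conv>\n';

// Double quotes: \n becomes one line-feed character.
$double = "<conv>\n";

// Three ways to get a real newline after markup that contains double quotes.
$escaped = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n";   // escape the inner quotes
$swapped = "<?xml version='1.0' encoding='utf-8'?>\n";       // single-quoted attributes
$concat  = '<?xml version="1.0" encoding="utf-8"?>' . "\n";  // append the newline

var_dump(strlen($single)); // 8
var_dump(strlen($double)); // 7
var_dump($escaped === $concat); // true
```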

Building your XML with an XML generator such as the built-in SimpleXML prevents this sort of problem, along with numerous others, and is usually far easier than assembling it by hand from strings.
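A minimal sketch of the SimpleXML route, assuming placeholder values for $from and $amnt. One caveat: asXML() on its own returns the markup without indentation, so this sketch hands the tree to DOM for the final formatting step. Treat it as one possible arrangement rather than the only way to do it.

```php
<?php
// Placeholder inputs; in the question these come from elsewhere.
$from = 'USD';
$amnt = '23';

// Start from an empty root element.
$conv = new SimpleXMLElement('<?xml version="1.0" encoding="utf-8"?><conv/>');
$conv->addChild('at', '6 January 2012 12:00');
$conv->addChild('rate', '1.56317');

$fromEl = $conv->addChild('from');
$fromEl->addChild('code', $from);
$fromEl->addChild('curr', 'Pound Sterling');
$fromEl->addChild('loc', 'UK');
$fromEl->addChild('amnt', $amnt);

// Convert to DOM only to get indented output.
$dom = dom_import_simplexml($conv)->ownerDocument;
$dom->formatOutput = true;

$pretty = $dom->saveXML();
echo $pretty;
```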

Gordon
dtech
2

You could:

  • Do it yourself by adding whitespace characters to your strings (\n, \t).
  • Output all your XML with a HEREDOC.
  • Build a DOMDocument, enable formatOutput, and output it with saveXML().

The first two are quick and dirty (the heredoc is the better of the two). The last is more robust, but takes more code.
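The DOMDocument option could look roughly like the sketch below. The addTextElement() helper and the placeholder values for $from and $amnt are assumptions made for illustration; they are not part of the question's code.

```php
<?php
// Hypothetical helper: append <$name>$text</$name> to $parent and return the new element.
function addTextElement(DOMDocument $doc, DOMNode $parent, string $name, string $text): DOMElement
{
    $el = $doc->createElement($name);
    // A text node is escaped on output, so &, < and > in $text are safe.
    $el->appendChild($doc->createTextNode($text));
    $parent->appendChild($el);
    return $el;
}

// Placeholder inputs.
$from = 'USD';
$amnt = '23';

$doc = new DOMDocument('1.0', 'utf-8');
$doc->formatOutput = true; // indent nested elements when saving

$conv = $doc->appendChild($doc->createElement('conv'));
addTextElement($doc, $conv, 'at', '6 January 2012 12:00');
addTextElement($doc, $conv, 'rate', '1.56317');

$fromEl = $conv->appendChild($doc->createElement('from'));
addTextElement($doc, $fromEl, 'code', $from);
addTextElement($doc, $fromEl, 'curr', 'Pound Sterling');
addTextElement($doc, $fromEl, 'loc', 'UK');
addTextElement($doc, $fromEl, 'amnt', $amnt);

$xmlpage = $doc->saveXML();
echo $xmlpage;
```

Because the document is built as a tree, the indentation follows the nesting automatically and there are no quotes or newlines to manage by hand.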

Jason McCreary
1

Add a \n to the end of every string you append to $xmlpage. After the echo, the page source should then show one element per line.

e.g.

        // Inner double quotes must be escaped inside a double-quoted string.
        $xmlpage = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n";

        $xmlpage .= "<conv>\n";
        $xmlpage .= "<at>6 January 2012 12:00</at>\n";
        $xmlpage .= "<rate>1.56317</rate>\n";
ThinkingMonkey
1

Use a HEREDOC. It is far easier to read than repeated string concatenation, allows tabs and multiple lines, and interpolates variables for you:

$xmlpage = <<<EOL
<?xml version="1.0" encoding="utf-8"?>
<conv>
    <at>6 January 2012 12:00</at>
    <rate>1.56317</rate>
    <from>
        <code>$from</code>
        <curr>Pound Sterling</curr>
        <loc>UK</loc>
        <amnt>$amnt</amnt>
    </from>
</conv>
EOL;
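One thing to keep in mind with interpolation is that the values go into the markup exactly as they are, so a stray & or < would break the XML. A minimal sketch of escaping first, assuming a hypothetical xmlEscape() helper and made-up sample values:

```php
<?php
// Hypothetical helper: escape a value for use as XML text or attribute content.
function xmlEscape(string $value): string
{
    return htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Made-up sample values; the second one contains characters that need escaping.
$from = xmlEscape('USD');
$curr = xmlEscape('R&D <GBP>');

$xmlpage = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<conv>
    <from>
        <code>$from</code>
        <curr>$curr</curr>
    </from>
</conv>
XML;

echo $xmlpage;
```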
Marc B
1

Use a stylesheet and an XML viewer to view it.
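If the page is generated from PHP, one way to point a viewer at a stylesheet is an xml-stylesheet processing instruction. The sketch below assumes a DOMDocument-built page and a placeholder stylesheet path, conv.xsl, which does not exist in the question. Browsers generally only apply the stylesheet when the response is served with an XML content type.

```php
<?php
$doc = new DOMDocument('1.0', 'utf-8');
$doc->formatOutput = true;

// 'conv.xsl' is a placeholder path for an XSLT stylesheet you would write yourself.
$pi = $doc->createProcessingInstruction(
    'xml-stylesheet',
    'type="text/xsl" href="conv.xsl"'
);
$doc->appendChild($pi);

// The processing instruction must come before the root element.
$conv = $doc->appendChild($doc->createElement('conv'));
$conv->appendChild($doc->createElement('rate', '1.56317'));

$xml = $doc->saveXML();
echo $xml;
```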

James McLeod
  • A bit overkill, right? XML this simple should be easily viewable even in a plain-text editor. – dtech Jan 06 '12 at 16:56
  • It's a best practice, regardless of the complexity (or lack thereof) of the XML. And it is not particularly difficult - overkill suggests a heavyweight tool for a lightweight job, and XSLT is not a heavyweight tool, nor are XML viewers difficult to come by these days, so no, I don't think this constitutes overkill. – James McLeod Jan 06 '12 at 23:35
-1

The simplest way would be to add the appropriate whitespace to the beginning of the strings, and the newlines to the ends.

        $xmlpage = '<?xml version="1.0" encoding="utf-8"?>' . "\n";

        $xmlpage .= '<conv>' . "\n";
        $xmlpage .= "\t" . '<at>6 January 2012 12:00</at>' . "\n";
        $xmlpage .= "\t" . '<rate>1.56317</rate>' . "\n";

        // <from> is a child of <conv>, so it and its children sit one level deeper.
        $xmlpage .= "\t" . '<from>' . "\n";
        $xmlpage .= "\t\t" . '<code>'.$from.'</code>' . "\n";
        $xmlpage .= "\t\t" . '<curr>Pound Sterling</curr>' . "\n";
        $xmlpage .= "\t\t" . '<loc>UK</loc>' . "\n";
        $xmlpage .= "\t\t" . '<amnt>'.$amnt.'</amnt>' . "\n";
        $xmlpage .= "\t" . '</from>' . "\n";

        $xmlpage .= '</conv>';

Or something along those lines, depending on your desired output.

code_burgar
-2

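Here's my prettify function, which indents the XML and escapes it for display in an HTML page, wrapping values and attributes in spans so they can be styled. You can modify it to suit your needs.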

function prettifyXML( $xml )
{
    // Break our XML up into sections of newlines.
    $xml = preg_replace( '/(<[^\/][^>]*?[^\/]>)/', "\n" . '\1', $xml );

    $xml = preg_replace( '/(<\/[^\/>]*>|<[^\/>]*?\/>)/', '\1' . "\n", $xml );
    $xml = str_replace( "\n\n", "\n", $xml );

    $xml_chunks = explode( "\n", $xml );

    $indent_depth = 0;
    $open_tag_regex = '/<[^\/\?][^>]*>/';
    $close_tag_regex = '/(<\/[^>]*>|<[^>]*\/>)/';

    // Fix the indenting.
    foreach ( $xml_chunks as $index => $xml_chunk )
    {
        $close_tag_count = preg_match( $close_tag_regex, $xml_chunk );

        $open_tag_count = preg_match( $open_tag_regex, $xml_chunk );

        if ( $open_tag_count >= $close_tag_count )
        {
            $temp_indent_depth = $indent_depth;
        }
        else
        {
            $temp_indent_depth = $indent_depth - $close_tag_count;
        }

        $xml_chunks[ $index ] = str_repeat( "\t", $temp_indent_depth ) . $xml_chunk;

        $indent_depth += $open_tag_count - $close_tag_count;

    }

    $xml = implode( "\n", $xml_chunks );

    // Add tokens for attributes and values.
    $attribute_regex = '/([\w:]+\="[^"]*")/';
    $value_regex = '/>([^<]*)</';

    $value_span_token = '##@@##@@';
    $attribute_span_token = '@@##@@##';
    $span_close_token = '#@#@#@#@';

    $xml = preg_replace( $value_regex, '>' . $value_span_token . '\1' . $span_close_token . '<', $xml );
    $xml = preg_replace( $attribute_regex, $attribute_span_token . '\1' .$span_close_token, $xml );
    $xml = htmlentities( $xml );

    // Replace the tokens that we added previously with their HTML counterparts.
    $xml = str_replace( $value_span_token, '<span class="value">', $xml );
    $xml = str_replace( $attribute_span_token, '<span class="attribute">', $xml );
    $xml = str_replace( $span_close_token, '</span>', $xml );

    return $xml;
}

It has been tested against a fair number of edge cases. It isn't highly efficient, since I only use it for viewing logs.
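For comparison, a parser-based way to re-indent an existing XML string might look like the sketch below. It is not the function above: reindentXml() is a hypothetical helper built on DOMDocument, and it assumes the input is well-formed (which means an XML declaration, if present, must use encoding rather than charset).

```php
<?php
// Hypothetical helper: re-indent a well-formed XML string; returns null if it does not parse.
function reindentXml(string $xml): ?string
{
    $doc = new DOMDocument();
    $doc->preserveWhiteSpace = false; // discard existing whitespace-only text nodes
    $doc->formatOutput = true;        // indent nested elements when saving

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $ok = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $doc->saveXML() : null;
}

$raw = '<conv><at>6 January 2012 12:00</at><from><code>USD</code></from></conv>';
$pretty = reindentXml($raw);

// For display inside an HTML page (for example a log viewer), escape the result.
echo '<pre>', htmlspecialchars((string) $pretty), '</pre>';
```

The trade-off is that the whole document is parsed into memory, but malformed input is rejected rather than reshuffled.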

Jonathan Rich
  • XML + regex = [BAD](http://stackoverflow.com/a/1732454/118068) – Marc B Jan 06 '12 at 17:28
  • Marc, I disagree - XML is a regular data structure, perfect for being parsed by regular expressions. When parsing large XML documents, sometimes it's faster to use regexes and it certainly takes less memory than trying to parse the entire document. – Jonathan Rich Jan 06 '12 at 17:31
  • In a perfect world, yes. But mangled data happens VERY frequently, and regexes on mangled XML will mangle it even worse. Always program defensively. – Marc B Jan 06 '12 at 17:32
  • If you have invalid XML then the problem isn't running regexes on the XML. You could make the same argument about SQL - corrupted tables happen frequently, and queries on corrupted tables will corrupt them even worse. It's a non sequitur. – Jonathan Rich Jan 06 '12 at 17:35