
I have HTML that looks like this when you view the source:

Original HTML

<!DOCTYPE html>
<html>
    <head>
    </head>
    <body>
    </body>
</html>

But after I do:

$dom = new DOMDocument();
$dom->loadHTML($html);
$dom->saveHTML();

My source code turns into this:

New HTML

<!DOCTYPE html><html><head></head><body></body></html>

How can I preserve new lines and white space when using the PHP DOMDocument() class and its methods?

GTS Joe
  • This isn't really a duplicate - the other question is about how to pretty-print a document that has been built dynamically, but this question is about how to preserve the layout of a document built from an HTML string. The answer may be similar but the use-case is completely different. – HappyDog Jul 22 '19 at 12:45

1 Answer


To preserve the whitespace, try something along these lines:

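$dom = new DOMDocument('1.0', 'UTF-8');
// Set these before loading the markup.
$dom->formatOutput = false;       // do not re-indent the output
$dom->preserveWhiteSpace = true;  // keep whitespace (this is the default)

// Be lenient with imperfect markup.
$dom->validateOnParse = false;
$dom->standalone = true;
$dom->strictErrorChecking = false;
$dom->recover = true;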

There are a few other properties you can set - you'll find them in the manual.
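
For reference, here is a minimal sketch of how those properties fit into a full load/save round trip, so you can check the effect on your own markup. The `$html` sample and the newline count are assumptions added for illustration, not part of the original answer. What actually survives may also depend on the libxml2 version your PHP build uses, so no particular output is claimed here.

```php
<?php
// Sample markup (an assumption for illustration; substitute your own).
$html = "<!DOCTYPE html>\n<html>\n    <head>\n    </head>\n    <body>\n        <p>Hello</p>\n    </body>\n</html>\n";

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument('1.0', 'UTF-8');
// preserveWhiteSpace has to be set before loading,
// formatOutput before saving.
$dom->preserveWhiteSpace = true;
$dom->formatOutput = false;
$dom->loadHTML($html);
libxml_clear_errors();

$out = $dom->saveHTML();

// Compare how many line breaks went in and how many came out.
printf(
    "newlines in: %d, newlines out: %d\n",
    substr_count($html, "\n"),
    substr_count($out, "\n")
);
echo $out;
```

If the two counts differ, the whitespace is being dropped by the parser rather than by the serializer, and the properties above will not bring it back on their own.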

Professor Abronsius
  • I tried all of your suggestions but it's still stripping all of my newlines. – GTS Joe Aug 02 '16 at 05:28
  • I found that having flags on the loadHTML function made this not work, for example: `$dom->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);` didn't work, while `$dom->loadHTML($content);` *did* work. In fact, preserving the white space is turned on by default. – Skeets Nov 13 '17 at 20:55
  • Did you make it work? – Marco May 05 '20 at 16:57
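
To test the observation about `loadHTML` flags on your own setup, a small comparison along these lines may help. The `roundTrip()` helper and the sample fragment are hypothetical names and data introduced here for illustration. The sketch only reports newline counts for both variants and does not assume which one wins.

```php
<?php
// Hypothetical helper: round-trips $html through DOMDocument with the
// given libxml option flags and returns the serialized markup.
function roundTrip(string $html, int $options = 0): string
{
    $dom = new DOMDocument();
    $dom->preserveWhiteSpace = true; // already the default, set for clarity
    $dom->loadHTML($html, $options);
    return (string) $dom->saveHTML();
}

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

// Sample fragment (an assumption for illustration).
$html = "<div>\n    <p>one</p>\n    <p>two</p>\n</div>\n";

$plain   = roundTrip($html);
$flagged = roundTrip($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
libxml_clear_errors();

printf("input:      %d newlines\n", substr_count($html, "\n"));
printf("no flags:   %d newlines\n", substr_count($plain, "\n"));
printf("with flags: %d newlines\n", substr_count($flagged, "\n"));
```

Running both variants against the same input makes it easier to tell whether the flags, rather than the DOMDocument properties, are what changes the whitespace.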