
Here is line 10:

$doc->loadHTML('<?xml encoding="utf-8" ?>'.$request->getData());

And here is the whole code snippet from my backend:

class ParseImageLinks{
public function __construct(){
}

public function Run(\DataLayer\Gallery\Requests\ParseImageLinks $request){
    $doc = new \DOMDocument('1.0');
    $doc->loadHTML('<?xml encoding="utf-8" ?>'.$request->getData());

    $images = $doc->getElementsByTagName ( "img");

    foreach ($images as $key => $value){
        $src = $value->getAttribute("src");
        $local = false;

        if ($src[0] == 'h') {
            $src = explode("http://", $src)[1];
        }else{
            if ($src[0] == '/') {
                $src = substr($src,1);
                $local = true;
            }
        }

        $parse = explode('/', $src);
        if (count($parse) > 2 && ($local || $parse[0] == $_SERVER['SERVER_NAME'] || $parse[0] == 'localhost:8080')) {
            $image = $parse[count($parse)-1];
            $size = $parse[count($parse)-2];

            $value->setAttribute("src", '/?image='.$image."&size=".$size);
        }
    }

    return $doc->saveHTML();
}
}

I spent several hours searching the web. Here is what I have tried so far:

  • @$doc->loadHTML('<?xml encoding="utf-8" ?>'.$request->getData());

  • libxml_use_internal_errors(true); $doc->loadHTML('<?xml encoding="utf-8" ?>'.$request->getData());

  • I created a test.php file with this

    <?php
    $doc = new \DOMDocument('1.0');
    $doc->loadHTML('<ul><li>text</li>'.
    '<li>&frac12; of this is <strong>strong</strong</li></ul>');
    foreach ($doc->getElementsByTagName('li') as $node)
    {
        echo htmlentities(iconv('UTF-8', 'ISO-8859-1', $node->nodeValue)), "\n";
    }
    ?>
    

    Running it with php -f test.php > res.html produced a res.html file with the expected text inside it.

  • I also checked if xml module is loaded and it is.
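A check like the last one can also be done from PHP itself. The following is only a sketch of one way to do it, not necessarily how the check above was performed; the `$status` array is a hypothetical name used for illustration.

```php
<?php
// Report whether the pieces DOMDocument depends on are available.
$status = [
    'dom'         => extension_loaded('dom'),
    'libxml'      => extension_loaded('libxml'),
    'DOMDocument' => class_exists('DOMDocument'),
];

foreach ($status as $name => $available) {
    echo $name, ': ', $available ? 'yes' : 'no', "\n";
}

// LIBXML_DOTTED_VERSION is defined by the libxml extension when it is loaded.
if (defined('LIBXML_DOTTED_VERSION')) {
    echo 'libxml version: ', LIBXML_DOTTED_VERSION, "\n";
}
```

Running this through the web server as well as the CLI can be worthwhile, since the two may use different php.ini files.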

So, if I understand correctly, DOMDocument works in general, but not in the file in question. Why?

UPD. I'm not sure, but this seems to be the reason the images are not loading properly.

llama
  • Why are you adding an `<?xml ?>` tag when you are using `loadHTML`? – Niet the Dark Absol Aug 28 '18 at 11:53
  • @NiettheDarkAbsol I inherited this code and am just trying to go through it all, fixing bugs along the way... I'm also a newbie at PHP. Could you explain why you think this tag is unnecessary? In other answers to similar problems I saw this usage, and nobody mentioned it was wrong in any way. – llama Aug 28 '18 at 12:00

1 Answer


Unfortunately, DOMDocument is still not able to parse an HTML5 document without complaints. You need to work around this by silencing the libxml errors, as shown in these questions:
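As a rough illustration of that kind of workaround, here is a minimal sketch. It assumes the warnings come from libxml while parsing; the `$html` sample, `$messages`, and `$imageCount` are hypothetical names used only for this example, and in the question's code the input would come from `$request->getData()`.

```php
<?php
// Hypothetical HTML5 input; in the question this would be $request->getData().
$html = '<section><img src="/gallery/large/photo.jpg" alt=""></section>';

// Collect libxml warnings internally instead of raising PHP warnings,
// remembering the previous setting so it can be restored afterwards.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument('1.0');
$loaded = $doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);

// Keep the collected messages for inspection rather than discarding them.
$messages = [];
foreach (libxml_get_errors() as $error) {
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}

// Empty libxml's error buffer and restore the earlier error-handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);

$imageCount = $doc->getElementsByTagName('img')->length;
```

Logging `$messages` instead of throwing them away can help tell harmless "unknown tag" warnings apart from input that is actually broken. Whether any messages are produced at all depends on the libxml version installed.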

Alex D