1

I have this HTML in my database:

<p>some text 1</p>
<img src=\"http://www.example.com/images/some_image_1.jpg\">
<p>some text 2</p>
<p>some text 3</p>
<img src=\"http://www.example.com/images/some_image_2.jpg\">
<p>some text 4</p>
<p>some text 5</p>
<img src=\"http://www.example.com/images/some_image_3.jpg\">

I need to conditionally remove specific <img> tags: not all of them, only particular ones.

I have tried this, but it removes all <img> tags, which is not what I want:

$dom = new \DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html);

$nodes = $dom->getElementsByTagName("img");

for($i = 0; $i < $nodes->length; $i++) {
    if ($i == 1) {
        continue;
    }
    $image = $nodes->item($i);
    $image->parentNode->removeChild($image);
}

return $dom->saveHTML();

Can someone help me with this? In this HTML example, let's say that I want to remove the first and third images but keep the second one.

Also, I have noticed that the saveHTML() method adds <html><body> tags to my HTML, and I do not want that. I don't see any option to turn this off. Any help there too?

Thanks in advance, I have been stuck on this for hours.

offline

2 Answers

1

You can do this by using an array. I modified your code so that it will not remove the second img tag.

$dom = new \DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html);

// Declare array with numeric values
$remainImages = array(1);

$nodes = $dom->getElementsByTagName("img");

for ($i = 0; $i < $nodes->length; $i++) {
    if (!in_array($i, $remainImages)) {
        $image = $nodes->item($i);
        $image->parentNode->removeChild($image);
    }
}

return $dom->saveHTML();
  • Your code will keep the second and third images for some reason. I have found a way to make it work: inside the for loop I build an array of the images to remove, then in an additional foreach loop I go through that array and remove them. – offline Aug 18 '16 at 07:48
  • Yes, the second image will remain with the code above, but not the third. So you need to make sure 2 is not added to this array: `$remainImages = array(1);` – Tanveer Hussain Aug 18 '16 at 08:03
  • It is not working as intended. Look here: http://phpfiddle.org/main/code/rkim-st8w and run the code. – offline Aug 18 '16 at 08:22
  • I found the issue in my code. Please pass 2 into the array if you want to retain the second image, like this: `$remainImages = array(2);` So if you want to retain 8 and 14, use `$remainImages = array(8,14);` – Tanveer Hussain Aug 18 '16 at 08:47
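The two-pass approach described in the comments can be sketched as follows. The likely cause of the confusion is that getElementsByTagName() returns a live list, so every removal shifts the remaining indexes while the loop is still running. The function name removeImagesExcept() and the $keep parameter are hypothetical and used only for illustration; this is a minimal sketch, not the asker's actual code.

```php
<?php
// Hypothetical helper, not from the thread: removes every <img> whose
// zero-based position is not listed in $keep.
function removeImagesExcept(string $html, array $keep): string
{
    $dom = new DOMDocument;
    $dom->loadHTML($html);

    // Live list: it shrinks as soon as a node is removed from the document.
    $nodes = $dom->getElementsByTagName('img');

    // Pass 1: only collect the nodes to remove, so the indexes stay stable.
    $toRemove = [];
    for ($i = 0; $i < $nodes->length; $i++) {
        if (!in_array($i, $keep, true)) {
            $toRemove[] = $nodes->item($i);
        }
    }

    // Pass 2: remove the collected nodes; the plain array is not affected.
    foreach ($toRemove as $image) {
        $image->parentNode->removeChild($image);
    }

    return $dom->saveHTML();
}

$html = '<p>some text 1</p><img src="http://www.example.com/images/some_image_1.jpg">'
      . '<p>some text 2</p><img src="http://www.example.com/images/some_image_2.jpg">'
      . '<p>some text 3</p><img src="http://www.example.com/images/some_image_3.jpg">';

// Keep only the second image (index 1).
echo removeImagesExcept($html, [1]);
```

Iterating the live list backwards, from the last index down to 0, would be another way to avoid the shifting indexes.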
1

There is an option to avoid adding the html and body tags when you load an HTML file or content:

$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
@$dom->loadHTML(file_get_contents('file.html'), LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
//@$dom->loadHTMLFile('file.html'); // adds html and body tags if they are missing

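// Copy the live node list into a plain array so removals do not shift the indexes
$nodes = iterator_to_array($dom->getElementsByTagName("img"));

foreach ($nodes as $i => $image) {
    if ($i == 1) {
        continue; // keep the second image
    }
    $image->parentNode->removeChild($image);
}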

return $dom->saveHTML();
//$dom->saveHtmlFile('file.html');

Some answers related to your question that were used in this answer:

  1. To delete an element (which you already do): https://stackoverflow.com/a/15272752/3086860
  2. To avoid adding extra tags: https://stackoverflow.com/a/22490902/3086860
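As an alternative to the libxml flags, one option is to let the parser add the html and body wrappers and then serialize only the children of body. This may be worth considering because LIBXML_HTML_NOIMPLIED is reported to restructure fragments that have several top-level elements. The helper name bodyInnerHtml() is hypothetical; this is a minimal sketch that assumes the input is a body-level fragment like the one in the question.

```php
<?php
// Hypothetical helper, not from the thread: returns the markup of the
// <body> children only, so no doctype, <html>, or <body> wrapper is emitted.
function bodyInnerHtml(DOMDocument $dom): string
{
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $out = '';
    foreach ($body->childNodes as $child) {
        // saveHTML() accepts a node and serializes just that subtree.
        $out .= $dom->saveHTML($child);
    }

    return $out;
}

$dom = new DOMDocument;
$dom->loadHTML('<p>some text 1</p><p>some text 2</p>');

$result = bodyInnerHtml($dom);
echo $result;
```

In the code from the answers, the final `return $dom->saveHTML();` could then be replaced by a call to this helper.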
Saeed.Gh