I am looking for an easy and efficient way to remove a specific image from an article. All I know is the URL of the image that needs to be removed.
- The image may or may not use different attributes.
- The image may or may not exist at all in the article.
- There might be other images (with different URLs) in the article.
My choice would be either regex or DOMDocument, probably using an HTML5 parser like https://github.com/Masterminds/html5-php.
My regex skills are not that good, and I'm not sure it's a good idea to use regex for this, because I have read that regex should be avoided for parsing HTML. What I have so far with regex removes every image tag completely, but I'm not sure how to remove only the one with a specific src URL.
$img_src = 'http://www.example.org/image_to_be_removed.jpg';
$article = '<h1>Test article with HTML5 tags</h1>
<nav><a href="/link1/">Link 1</a></nav>
<p>This is an example article. The article may or may not include html5 tags, images and other things.</p>
<img src="http://www.example.org/image_to_be_removed.jpg">
<p>More example text.</p>';
// Current attempt: this strips every <img> tag, not only the one matching $img_src.
$article = preg_replace("/<img[^>]+\>/i", "", $article);
echo $article;
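For comparison, here is a minimal sketch of how the regex could be narrowed to a single src URL. The helper name remove_image_by_src_regex is hypothetical, and the pattern assumes the src value is quoted, appears literally in the markup (no entity encoding such as &amp;), and that no attribute value contains a ">" character. It illustrates the idea and is not a robust HTML parser.

```php
<?php
// Hypothetical helper (not part of any library): removes <img> tags
// whose src attribute equals $imgSrc exactly.
function remove_image_by_src_regex(string $html, string $imgSrc): string
{
    // Escape regex metacharacters in the URL; '~' is the pattern delimiter.
    $quoted = preg_quote($imgSrc, '~');

    // <img, any other attributes, whitespace, src = "URL" or 'URL',
    // then the rest of the tag. \s before src avoids matching data-src.
    $pattern = '~<img\b[^>]*\ssrc\s*=\s*(["\'])' . $quoted . '\1[^>]*>~i';

    return preg_replace($pattern, '', $html);
}
```

Called as remove_image_by_src_regex($article, $img_src), this should leave images with any other URL untouched and return the article unchanged if the image is not present.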
I haven't dug into the DOMDocument solution yet, because I am not sure whether it's even possible. Or would regex be considered best practice here?
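To make the DOMDocument option concrete, here is a rough sketch using only PHP's built-in DOM extension rather than html5-php. The function name remove_image_by_src_dom is hypothetical, and wrapping the fragment in a html/body skeleton with a UTF-8 meta tag is my assumption about how the article fragment would be loaded. libxml may also normalize the markup slightly when it is serialized again.

```php
<?php
// Hypothetical helper: removes every <img> whose src equals $imgSrc,
// using PHP's built-in DOMDocument.
function remove_image_by_src_dom(string $html, string $imgSrc): string
{
    $doc = new DOMDocument();

    // libxml warns about HTML5 tags such as <nav>; collect the warnings
    // internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>'
        . $html
        . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The node list is live, so copy it to an array before removing nodes.
    foreach (iterator_to_array($doc->getElementsByTagName('img')) as $img) {
        if ($img->getAttribute('src') === $imgSrc) {
            $img->parentNode->removeChild($img);
        }
    }

    // Serialize only the children of <body> to get the fragment back.
    $body = $doc->getElementsByTagName('body')->item(0);
    $result = '';
    foreach ($body->childNodes as $child) {
        $result .= $doc->saveHTML($child);
    }

    return $result;
}
```

With this approach the other attributes on the image do not matter, since only the parsed src value is compared, and an article without the image simply passes through the loop untouched.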