I have an example:

<a href="http://test.html" class="watermark" target="_blank">
   <img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
</a>

I am using preg_replace to change the class value of the a tag and the src of the img tag:

$content = preg_replace('#<a(.*?)href="([^"]*/)?(([^"/]*)\.[^"]*)"([^>]*?)><img(.*?)src="([^"]*/)?(([^"/]*)\.[^"]*)"([^>]*?)></a>#', '<a href=$2$3 class="fancybox"><img$1src="http://test.html/uploads/2013/10/10_new.jpg"></a>', $content); 

How can I get the following result?

<a href="http://test.html" class="fancybox" target="_blank">
    <img width="399" height="4652" src="http://test.html/uploads/2013/10/10_new.jpg" class="aligncenter size-full wp-image-78360">
</a>
Hai Truong IT

2 Answers

Regex, as is mentioned many times daily here on SO, is not the best tool for HTML manipulation - luckily we have the DOMDocument object!

If you're supplied with just that string you can make the changes like so:

$orig = '   <a href="http://test.html" class="watermark" target="_blank">
                <img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
        </a>';
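$doc = new DOMDocument();
$doc->loadHTML($orig);
// The fragment contains a single link, so take the first <a>.
$anchor = $doc->getElementsByTagName('a')->item(0);
if($anchor->getAttribute('class') == 'watermark')
{
    $anchor->setAttribute('class','fancybox');
    $img = $anchor->getElementsByTagName('img')->item(0);
    $currSrc = $img->getAttribute('src');
    // Insert "_new" before the file extension: 10.jpg becomes 10_new.jpg.
    $img->setAttribute('src',preg_replace('/(\.[^\.]+)$/','_new$1',$currSrc));
}
// Passing the node serializes only the <a> element, not the wrapper document.
$newStr = $doc->saveHTML($anchor);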
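
To see what the src rewrite does on its own, here is a minimal sketch that pulls the preg_replace call into a function. The name addSuffixBeforeExtension and its $suffix parameter are hypothetical, introduced only for illustration; the pattern is the one used above (the backslash inside the character class is dropped because it is redundant there).

```php
<?php
// Hypothetical helper, not part of the answer: inserts a suffix before the
// last ".extension" of a URL or file name.
function addSuffixBeforeExtension(string $url, string $suffix = '_new'): string
{
    // (\.[^.]+)$ captures the final dot and everything after it up to the
    // end of the string; the replacement puts the suffix in front of it.
    return preg_replace('/(\.[^.]+)$/', $suffix . '$1', $url);
}

echo addSuffixBeforeExtension('http://test.html/uploads/2013/10/10.jpg'), "\n";
// prints http://test.html/uploads/2013/10/10_new.jpg
```

One caveat worth keeping in mind: [^.]+ also matches slashes, so a URL whose path has no extension (for example http://test.html/uploads/file) would have the suffix inserted into the host name instead. If such URLs can occur, it is probably safer to check the path first.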

Otherwise, if you're working with the source of a full HTML document:

$orig = '<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title></title>
</head>
<body>
    <a href="http://test.html" class="watermark" target="_blank">
        <img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
    </a>
    <span>random</span>
    <a href="http://test.html" class="watermark" target="_blank">
        <img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
    </a>
    <a href="#foobar" class="gary">
        <img src="/imgs/yay.png" />
    </a>
</body>
</html>';
$doc = new DOMDocument();
$doc->loadHTML($orig);
$anchors = $doc->getElementsByTagName('a');
foreach($anchors as $anchor)
{
    if($anchor->getAttribute('class') == 'watermark')
    {
        $anchor->setAttribute('class','fancybox');
        $img = $anchor->getElementsByTagName('img')->item(0);
        $currSrc = $img->getAttribute('src');
        $img->setAttribute('src',preg_replace('/(\.[^\.]+)$/','_new$1',$currSrc));
    }
}
$newStr = $doc->saveHTML();

That said, as a brain exercise, here is a regex solution as well, since that was the original question and the DOM approach can feel like an overkill amount of code (though it is still preferable):

$newStr = preg_replace('#<a(.+?)class="watermark"(.+?)<img(.+?)src="(.+?)(\.[^.]+?)"(.*?>.*?</a>)#s','<a$1class="fancybox"$2<img$3src="$4_new$5"$6',$orig);
MDEV

Don't parse HTML with regex.

Find all links in the HTML that have the watermark class, change the class to fancybox, and update the src of the first descendant image:

$dom = new DOMDocument;
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//a[contains(@class, "watermark")]') as $a) {
    $a->setAttribute('class', 'fancybox');

    $img = $xpath->query('descendant::img', $a)->item(0);
    # old value = $img->getAttribute('src');
    $img->setAttribute('src', 'new_value');
}
echo $dom->saveHTML();
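
A possible variation on the above, as a sketch rather than a definitive version: contains(@class, "watermark") also matches class names such as "watermarked", and the @ operator hides every parse warning. The snippet below assumes you want to match watermark only as a whole class token, collect libxml warnings via libxml_use_internal_errors instead of suppressing them, and derive the new src from the old one by inserting _new before the extension. The sample $html and the second "watermarked" link are made up for the demonstration.

```php
<?php
// Sample input: the first link is from the question, the second is a
// made-up link whose class merely contains the substring "watermark".
$html = '<a href="http://test.html" class="watermark" target="_blank">'
      . '<img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">'
      . '</a><a href="#" class="watermarked"><img src="/imgs/yay.png"></a>';

// Collect libxml parse warnings internally instead of silencing them with @.
$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);
// Match "watermark" only as a whole class token, so "watermarked" is skipped.
$query = '//a[contains(concat(" ", normalize-space(@class), " "), " watermark ")]';
foreach ($xpath->query($query) as $a) {
    // As in the answer, this replaces the entire class attribute.
    $a->setAttribute('class', 'fancybox');

    $img = $xpath->query('descendant::img', $a)->item(0);
    if ($img === null) {
        continue; // link without an image: nothing to rewrite
    }
    // Insert "_new" before the file extension of the current src.
    $img->setAttribute('src', preg_replace('/(\.[^.]+)$/', '_new$1', $img->getAttribute('src')));
}
$result = $dom->saveHTML();
echo $result;
```

Because loadHTML was given a fragment, saveHTML() here returns it wrapped in a doctype, html and body; pass a node to saveHTML() if you only need the link itself.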
Glavić