1

I want to modify the contents of an HTML file with PHP. I am applying a style to img tags, and I need to check whether a tag already has a style attribute; if it does, I want to replace it with my own.

$pos = strpos($theData, "src=\"".$src."\" style=");
    if (!$pos){
        $theData = str_replace("src=\"".$src."\"", "src=\"".$src."\" style=\"width:".$width."px\"", $theData);
    }
    else{
        $theData = preg_replace("src=\"".$src."\" style=/\"[^\"]+\"/", "src=\"".$src."\" style=\"width: ".$width."px\"", $theData);
    }

$theData is the HTML source code I receive. If no style attribute is found, my own style is inserted successfully. The problem seems to occur when a style attribute is already defined: my regex does not work in that case.

I want to replace the existing style attribute, including everything inside it, with my new style attribute. What should my regex look like?

user4035
Morne

4 Answers

4

Instead of using regex for this, you should use a DOM parser.

Example using DOMDocument:

<?php
$html = '<img src="http://example.com/image.jpg" width=""/><img src="http://example.com/image.jpg"/>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML('<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />'.$html);
$dom->formatOutput = true;

foreach ($dom->getElementsByTagName('img') as $item)
{
    //Remove the width attr if it's there
    $item->removeAttribute('width');

    //Get the style attr if it's there
    $style = $item->getAttribute('style');

    //Set the style, appending the existing style if there is one; 123px could be your $width var
    $item->setAttribute('style','width:123px;'.$style);
}
//Remove the unwanted doctype, html, head and body tags
$ret = preg_replace('~<(?:!DOCTYPE|/?(?:html|body|head))[^>]*>\s*~i', '', $dom->saveHTML());
echo trim(str_replace('<meta http-equiv="Content-Type" content="text/html;charset=utf-8">','',$ret));

//<img src="http://example.com/image.jpg" style="width:123px;">
//<img src="http://example.com/image.jpg" style="width:123px;">

?>
Lawrence Cherone
  • I have never used a DOM parser. What is the advantage of using one over the str_replace set-up I am currently using? – Morne Jun 19 '13 at 08:55
  • Simply put, **you should never parse HTML with regex**, period. There are too many cases that can cause your regex to fail. http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags?rq=1 – Lawrence Cherone Jun 19 '13 at 08:58
  • 2
    To elaborate: in your selected answer, what happens if the style attribute comes before the src, or if there is an extra space between the attributes `" style="`? – Lawrence Cherone Jun 19 '13 at 09:06
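
The attribute-order concern raised in the comment above can be sketched as follows. This is only an illustration: `setImgWidth` is a hypothetical helper name (not from the answer), the sample markup is made up, and the input is assumed to be plain ASCII HTML so that no charset handling is needed. The DOM version sets the style on every img with a matching src, wherever the style attribute sits, while a pattern of the form `src="..." style="..."` does not match when the order is reversed.

```php
<?php
// Hypothetical helper: replace the style of every <img> whose src matches.
// Assumes ASCII input, so no charset handling is done here.
function setImgWidth(string $html, string $src, int $width): string
{
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // Wrap the fragment in a div so it can be pulled back out afterwards.
    $dom->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();

    foreach ($dom->getElementsByTagName('img') as $img) {
        if ($img->getAttribute('src') === $src) {
            // setAttribute overwrites an existing style or creates a new one.
            $img->setAttribute('style', 'width:' . $width . 'px');
        }
    }

    // Serialize only the children of the wrapper div (first div in the document).
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

// style comes before src, with an extra space between the attributes.
$html = '<img style="color:red"  src="/image.png">';

// A pattern that expects src immediately followed by style finds nothing here.
$viaRegex = preg_replace(
    '/src="\/image\.png" style=".*?"/i',
    'src="/image.png" style="width: 10px;"',
    $html
);
var_dump($viaRegex === $html); // bool(true): the input is left unchanged

echo setImgWidth($html, '/image.png', 10), "\n";
```

The DOM approach compares the parsed attribute value, so quoting style, spacing and attribute order in the source do not matter.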
1

Here is a regexp-based way to solve this problem:

<?php
$theData = "<img src=\"/image.png\" style=\"lol\">";
$src = "/image.png";
$width = 10;

//you must escape potential special characters in $src, 
//before using it in regexp
$regexp_src = preg_quote($src, "/");

$theData = preg_replace(
    '/src="'. $regexp_src .'" style=".*?"/i',
    'src="'. $src .'" style="width: '. $width . 'px;"',
    $theData);

print $theData;

prints:

<img src="/image.png" style="width: 10px;">
user4035
0

Regular expression:

(<[^>]*)style\s*=\s*('|")[^\2]*?\2([^>]*>)

Replace with:

$1$3
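
Replacing with `$1$3` as above drops the style attribute entirely. To substitute a new style instead, the replacement can put the new attribute between the two captured parts. The following is a rough sketch in PHP, not part of the original answer: the helper name `replaceStyle` and the sample markup are assumptions, and the value is matched with `.*?` under the `s` flag because a backreference such as `\2` has no special meaning inside a character class in PCRE.

```php
<?php
// Hypothetical helper: swap the first style attribute in each tag for a width rule.
function replaceStyle(string $html, int $width): string
{
    // Group 1: tag text before style; group 2: the quote character;
    // group 3: the rest of the tag after the closing quote.
    $pattern = '~(<[^>]*)style\s*=\s*(\'|").*?\2([^>]*>)~s';
    $replacement = '${1}style="width: ' . $width . 'px;"${3}';
    return preg_replace($pattern, $replacement, $html);
}

echo replaceStyle('<img src="/image.png" style=\'color:red\' alt="x">', 10), "\n";
// <img src="/image.png" style="width: 10px;" alt="x">
```

Note that this sketch matches the text `style` anywhere inside a tag, so an attribute such as `data-style` would also be caught; it is only meant to show the shape of the replacement.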

Example:

http://rubular.com/r/28tCIMHs50

Noqomo
0

Search for:

<img([^>]*)style="([^"]*)"

and replace with:

<img\1style="attribute1: value1; attribute2: value2;"

http://regex101.com/r/zP2tV9

Chirag Bhatia - chirag64