I have a bit of a situation. The site I am working on has two sections: the mobile site and the main site. They both fetch content from the same database table. It's a blog site. When admins create content that contains images using the text editor (CKEditor), a style attribute is attached to the resulting img tag, so the output looks like this:

<img alt="some content" src="some location" style="width:520px; height:600px;" />

This works great on the main site, but on the mobile site the images are poorly scaled and stretched. I have a thumbnailing script that could address that, but I need a way to get the src attribute before the page loads and a way to remove the style attribute.

I tried to do this using regex:

$str=$blog_post_column_from_database

$pattern=array ('#\<img alt="(.*?)" src="(.*)" style="(.*?)" /> #' );

$replacement=array ( '<img src="$my_thumbnailer_here.php?src=\\2" width="100%" />' );

$a=(string)$str; //converts text to string to avoid code lines from executing

return preg_replace($pattern,$replacement,$a);

Please, what am I doing wrong? Regex is not my strong point. Thanks.

jcobhams
  • Regexes on HTML should be avoided. Use [DOM](http://php.net/dom) instead. – Marc B Nov 20 '13 at 15:25
  • http://stackoverflow.com/questions/5517255/remove-style-attribute-from-html-tags – gherkins Nov 20 '13 at 15:25
  • @MarkResølved Thanks for the link. It works well, but it does not give me a way to insert my thumbnail variable, and I don't really know how to hack it in. :) – jcobhams Nov 20 '13 at 15:46
  • @MarcB Thanks for the link. I will look into that, but I need a quick fix for now. Once I get the hang of PHP DOM I will switch. Thanks all the same. – jcobhams Nov 20 '13 at 15:47

2 Answers

As already suggested in the comments, you'll be better off using PHP's DOMDocument.

Something like this should do the trick:

example: http://3v4l.org/Gv4dp

//get new domdoc instance
$dom=new DOMDocument();

//load your html
$dom->loadHTML($your_html);

//get all images
$imgs = $dom->getElementsByTagName("img");

//iterate over those
foreach($imgs as $img){
    //remove style attribute
    $img->removeAttribute('style');
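    //prefix src attribute with scriptname; urlencode() keeps a src that
    //contains ?, & or spaces intact as a single query parameter
    $img->setAttribute( 'src' , 'thumbnail.php?img=' . urlencode($img->getAttribute('src')) );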
}

//output modified html
echo $dom->saveHTML();

You might want to remove the doctype declaration and the <html> and <body> elements that are created when saving the document as HTML. To do that, replace the last line with:

echo preg_replace('/^<!DOCTYPE.+?>/', '', str_replace( array('<html>', '</html>', '<body>', '</body>'), '', $dom->saveHTML()));

see removing doctype while saving domdocument
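
As an alternative to stripping the wrapper tags with string replacement, the following is a minimal sketch that serializes only the nodes of interest by passing each node to saveHTML(). The function name mobilize_images, the wrapper <div>, the added width="100%" attribute (taken from the question's replacement string), and the default 'thumbnail.php?img=' prefix are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper: rewrite <img> tags in an HTML fragment and return
// the fragment without the doctype/<html>/<body> wrapper.
function mobilize_images(string $html, string $thumbnailer = 'thumbnail.php?img='): string
{
    $dom = new DOMDocument();

    // Collect parser warnings (e.g. for tags libxml does not know)
    // instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so it can be located again after parsing.
    $dom->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        // Drop the fixed pixel dimensions added by the editor.
        $img->removeAttribute('style');
        // Route the image through the thumbnail script.
        $img->setAttribute('src', $thumbnailer . urlencode($img->getAttribute('src')));
        // Let the image scale with its container (assumption from the question).
        $img->setAttribute('width', '100%');
    }

    // The wrapper is the first <div> in document order; serialize its children only.
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo mobilize_images('<p>Hi</p><img alt="a" src="/img/a.jpg" style="width:520px; height:600px;" />');
```

Because only the child nodes of the wrapper are serialized, no doctype, <html> or <body> tags appear in the result, so no post-processing with preg_replace or str_replace should be needed.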

gherkins

Try the following regexp:

$pattern=array ('#<img alt="(.*?)" src="(.*)" style="(.*?)" />#' );

The backslash at the beginning and the space at the end have been removed.

To do this correctly, you should first find all img tags and then modify each one.

Your regexp will not work when the alt attribute is missing or when the attributes appear in a different order.
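
The two-step approach described above (find every img tag first, then change it) could be sketched with preg_replace_callback as follows. The function name rewrite_img_tags and the 'thumbnail.php?src=' prefix are hypothetical placeholders, and the sketch assumes double-quoted attribute values that contain no ">" character. It remains a quick fix rather than a real HTML parser.

```php
<?php
// Hypothetical helper: rewrite <img> tags regardless of attribute order.
function rewrite_img_tags(string $html, string $thumbnailer = 'thumbnail.php?src='): string
{
    // Step 1: match every <img ...> tag, whatever attributes it carries.
    return preg_replace_callback(
        '#<img\b[^>]*>#i',
        function (array $m) use ($thumbnailer): string {
            // Step 2: look for src="..." inside the matched tag only.
            // \s (not \b) avoids matching attributes such as data-src.
            if (!preg_match('#\ssrc="([^"]*)"#i', $m[0], $src)) {
                return $m[0]; // no src attribute: leave the tag untouched
            }
            // Decode entities such as &amp; before URL-encoding the value.
            $url = urlencode(html_entity_decode($src[1]));
            // Rebuild the tag without the style attribute, as in the question.
            return '<img src="' . $thumbnailer . $url . '" width="100%" />';
        },
        $html
    );
}

echo rewrite_img_tags('<p>x</p><img style="width:520px" src="a.jpg" alt="demo" />');
```

Because the src attribute is looked up inside each matched tag, it does not matter whether alt is present or in which order the attributes appear.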

newman