I have an HTML snippet with an image somewhere inside it, and I want to replace the value of its src attribute. That is, I want to get from something like:

<div style="position: relative" class="img-p"><a href="http://politiken.dk/indland/ECE2145750/nu-kommer-loven-om-alkolaase-spritbilister-skal-betale-6000-kr/"><img src="http://multimedia.pol.dk/archive/00802/RB_PLUS_Danskerne___802815p.jpg" width="369" height="253" alt="SPRITKONTROL" /></a></div>

To something like this:

<div style="position: relative" class="img-p"><a href="http://politiken.dk/indland/ECE2145750/nu-kommer-loven-om-alkolaase-spritbilister-skal-betale-6000-kr/"><img src="http://multimedia.pol.dk/archive/00802/SNOOTS.jpg" width="369" height="253" alt="SPRITKONTROL" /></a></div>

I've tried:

$content = preg_replace('/<img\s+src="([^"]+)"[^>]+>/i', '<img src="http://multimedia.pol.dk/archive/00802/SNOOTS.jpg"', $string); 
echo htmlspecialchars($content);

But that removed the width, height and alt attributes.

Giacomo1968
Morten

2 Answers

Okay, instead of using regex logic, what about using DOMDocument()? This example works for me:

# Source HTML for this example. Broken up into lines for readability.
$html_value = '<div style="position: relative" class="img-p">'
            . '<a href="http://politiken.dk/indland/ECE2145750/nu-kommer-loven-om-alkolaase-spritbilister-skal-betale-6000-kr/">'
            . '<img src="http://multimedia.pol.dk/archive/00802/RB_PLUS_Danskerne___802815p.jpg" width="369" height="253" alt="SPRITKONTROL" />'
            . '</a>'
            . '</div>'
            ;

# The new `img src` URL.
$new_img_src = 'http://multimedia.pol.dk/archive/00802/SNOOTS.jpg';

# Instantiate `DOMDocument()`
$dom = new DOMDocument();

# Load the HTML into `DOMDocument()`
$dom->loadHTML($html_value);

# Collect the `img` tags.
$img_tags = $dom->getElementsByTagName('img');

# Roll through the `img` tags.
foreach ($img_tags as $tag) {

  # Set the `src` attribute to be the new value.
  $tag->setAttribute('src', $new_img_src);

  # The change is applied to the node inside `$dom`, so no per-tag save is needed here.
}

# Strip out the DOCTYPE, html & body tags.
$final_tags = preg_replace('~<(?:!DOCTYPE|/?(?:html|body))[^>]*>\s*~i', '', $dom->saveHTML());

# Echo the final tags.
echo $final_tags;
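
As a possible alternative to stripping the wrapper tags with `preg_replace`, `loadHTML()` accepts libxml option flags as a second argument. The sketch below is not part of the original answer: it assumes PHP 5.4 or later built against libxml 2.7.8 or later, where `LIBXML_HTML_NOIMPLIED` and `LIBXML_HTML_NODEFDTD` are available, and it assumes the snippet has a single root element (here the `div`).

```php
<?php
# Same source HTML and new URL as in the example above.
$html_value = '<div style="position: relative" class="img-p">'
            . '<a href="http://politiken.dk/indland/ECE2145750/nu-kommer-loven-om-alkolaase-spritbilister-skal-betale-6000-kr/">'
            . '<img src="http://multimedia.pol.dk/archive/00802/RB_PLUS_Danskerne___802815p.jpg" width="369" height="253" alt="SPRITKONTROL" />'
            . '</a>'
            . '</div>'
            ;
$new_img_src = 'http://multimedia.pol.dk/archive/00802/SNOOTS.jpg';

$dom = new DOMDocument();

# Ask libxml not to add implied html/body elements or a default DOCTYPE.
$dom->loadHTML($html_value, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

# Update the `src` attribute of every `img` element.
foreach ($dom->getElementsByTagName('img') as $tag) {
  $tag->setAttribute('src', $new_img_src);
}

# Serialize the document; with the flags above there is no wrapper to strip.
$final_tags = $dom->saveHTML();

echo $final_tags;
```

Note that `saveHTML()` serializes as HTML rather than XHTML, so the `img` tag will typically come back without the trailing ` />`.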
Giacomo1968
  • It gets wrapped in DOCTYPE and HTML tags. – Morten Nov 28 '13 at 22:36
  • Try it now. Using `preg_replace` to strip all that other stuff out. Got that tip from here. http://stackoverflow.com/questions/11216726/php-domdocument-without-the-dtd-head-and-body-tags – Giacomo1968 Nov 28 '13 at 22:42
Give this one a shot: <img\s+src=["']([^'"]+)["']

(replace with: <img src="http://multimedia.pol.dk/archive/00802/SNOOTS.jpg")

Converts this:

<img src="http://multimedia.pol.dk/archive/00802/RB_PLUS_Danskerne___802815p.jpg" width="369" height="253" alt="SPRITKONTROL" />

to this:

<img src="http://multimedia.pol.dk/archive/00802/SNOOTS.jpg" width="369" height="253" alt="SPRITKONTROL" />

Here's a working example: http://regex101.com/r/uE6oG5
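
For use from PHP, the pattern above could be dropped into `preg_replace` roughly as follows. This is a minimal sketch, and the variable names (`$string`, `$new_img_src`, `$pattern`, `$content`) are only illustrative. Because the pattern stops right after the closing quote of `src`, the rest of the tag (width, height, alt) is left untouched. It only matches when `src` is the first attribute of the `img` tag.

```php
<?php
# Source HTML, as in the question.
$string = '<div style="position: relative" class="img-p">'
        . '<a href="http://politiken.dk/indland/ECE2145750/nu-kommer-loven-om-alkolaase-spritbilister-skal-betale-6000-kr/">'
        . '<img src="http://multimedia.pol.dk/archive/00802/RB_PLUS_Danskerne___802815p.jpg" width="369" height="253" alt="SPRITKONTROL" />'
        . '</a>'
        . '</div>'
        ;

# The new `img src` URL.
$new_img_src = 'http://multimedia.pol.dk/archive/00802/SNOOTS.jpg';

# Match only `<img src="...">` up to the closing quote, not the rest of the tag.
# The single quotes inside the pattern are escaped for the PHP string literal.
$pattern = '/<img\s+src=["\']([^\'"]+)["\']/i';

# Swap in the new URL; everything after the `src` attribute is preserved.
$content = preg_replace($pattern, '<img src="' . $new_img_src . '"', $string);

echo htmlspecialchars($content);
```

If the new URL could ever contain `$` or `\` followed by digits, `preg_replace` would treat those as backreferences in the replacement string, so `preg_replace_callback` may be the safer choice in that case.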

brandonscript
  • I saw your working example at the link you gave me... Impressive... Could you show me how to put the regex into a PHP preg_replace? – Morten Nov 28 '13 at 22:37
  • Take his example, and just change your `preg_replace` to this: `preg_replace('/ – Giacomo1968 Nov 28 '13 at 22:48