
I have an HTML page that contains images like:

<img src="media/lib/pics/1495343165.jpg" style="width: 600px; height: 400px; margin: 5px;" />

I would like to extract only the image name, "1495343165.jpg", and replace the whole image tag with:

<img src="my/new/path/1495343165.jpg"  />

How can I do that using regex and PHP?

Thanks

Zuhair Ali
  • Take a look at https://stackoverflow.com/questions/138313/how-to-extract-img-src-title-and-alt-from-html-using-php – BDS Jun 04 '17 at 11:21

2 Answers


You can use XPath to only target the img nodes you want:

$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTMLFile($filePath, LIBXML_HTML_NODEFDTD);
// or $dom->loadHTML($htmlString, LIBXML_HTML_NODEFDTD);

$xp = new DOMXPath($dom);

$nodeList = $xp->query('//img[starts-with(@src, "media/lib/pics/")]');

$newPath = 'my/new/path/';

foreach ($nodeList as $node) {
    $imgFileName = basename($node->getAttribute('src'));
    $imgNode = $dom->createElement('img'); // create a new img element to replace the old img node
    $imgNode->setAttribute('src', $newPath . $imgFileName);
    $node->parentNode->replaceChild($imgNode, $node);
}

$result = $dom->saveHTML();

XPath query details:

//   # everywhere in the DOM tree
img  # an img element
[    # open a predicate
starts-with(@src, "media/lib/pics/") # with a src attribute that starts with "media/lib/pics/"
]    # close the predicate
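
To see the approach above end to end, here is a minimal, self-contained sketch. The function name `rewritePicsWithXPath` and the sample HTML are hypothetical, added only for illustration. It also assumes the input fragment has a single root element, because it adds `LIBXML_HTML_NOIMPLIED` so that libxml does not wrap the fragment in `<html>` and `<body>` tags.

```php
<?php
// Hypothetical helper: replaces every <img> whose src starts with
// "media/lib/pics/" by a bare <img> pointing at $newPath.
function rewritePicsWithXPath(string $html, string $newPath): string
{
    $dom = new DOMDocument;
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    // Assumption: $html has exactly one root element.
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    $xp = new DOMXPath($dom);
    $nodeList = $xp->query('//img[starts-with(@src, "media/lib/pics/")]');

    foreach ($nodeList as $node) {
        $imgFileName = basename($node->getAttribute('src'));
        $imgNode = $dom->createElement('img'); // new element carries only src
        $imgNode->setAttribute('src', $newPath . $imgFileName);
        $node->parentNode->replaceChild($imgNode, $node);
    }

    return $dom->saveHTML();
}

// Hypothetical sample: one image to rewrite, one to leave alone.
$sample = '<div>'
        . '<img src="media/lib/pics/1495343165.jpg" style="width: 600px; height: 400px;" />'
        . '<img src="other/path/logo.png" alt="logo" />'
        . '</div>';

$result = rewritePicsWithXPath($sample, 'my/new/path/');
echo $result;
```

Only the first image is replaced. The second keeps its original `src` and `alt` attributes because the XPath predicate does not select it.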
Casimir et Hippolyte
  • Thank you for your reply. This pattern works for me: `$re = '/src\="((media\/lib\/pics\/)([0-9]*)\.(jpg|png|gif))"/'; $str = ''; preg_match_all($re, $str, $matches);` – Zuhair Ali Jun 04 '17 at 12:28
  • @ZuhairAli: OK. What your pattern can't do: match src attributes that use single quotes or no quotes at all, or that have whitespace around the `=`. What your pattern shouldn't do, but does: match attributes that aren't the src attribute (e.g. ABCDsrc), match src attributes that are not in an img tag, and match strings inside HTML, JavaScript, or CSS comments and inside CSS or JavaScript strings. Do you really think that using a regex is a good idea? – Casimir et Hippolyte Jun 04 '17 at 12:39
  • Thanks, Casimir et Hippolyte. I agree with you; I will go with your solution :) – Zuhair Ali Jun 05 '17 at 09:13
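
The regex limitations raised in the comments above can be checked directly. This is a small sketch that runs the pattern from the comment, unchanged, against a few hypothetical inputs. The sample strings and the `$cases` and `$matched` names are illustrative, not from the thread.

```php
<?php
// The pattern quoted in the comment above, unchanged.
$re = '/src\="((media\/lib\/pics\/)([0-9]*)\.(jpg|png|gif))"/';

// Hypothetical inputs, one per limitation mentioned in the comments.
$cases = [
    'double quotes'   => '<img src="media/lib/pics/1495343165.jpg" />',
    'single quotes'   => "<img src='media/lib/pics/1495343165.jpg' />",
    'spaces around =' => '<img src = "media/lib/pics/1495343165.jpg" />',
    'inside comment'  => '<!-- <img src="media/lib/pics/1495343165.jpg" /> -->',
    'other attribute' => '<img data-src="media/lib/pics/1495343165.jpg" />',
];

$matched = [];
foreach ($cases as $label => $html) {
    // preg_match returns 1 when the pattern is found, 0 when it is not
    $matched[$label] = preg_match($re, $html) === 1;
    printf("%-16s %s\n", $label, $matched[$label] ? 'match' : 'no match');
}
```

The single-quoted and spaced variants are missed, while the commented-out tag and the `data-src` attribute are matched even though they should be ignored.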

You can use DOMDocument and basename():

<?php
$html = '<img src="media/lib/pics/1495343165.jpg" style="width: 600px; height: 400px; margin: 5px;" />';
$doc = new DOMDocument();
$doc->loadHTML($html);

// read the src attribute of the first img element
$src = $doc->getElementsByTagName('img')->item(0)->getAttribute('src');

// keep only the file name and prepend the new path
echo '<img src="my/new/path/'.basename($src).'" />';
//<img src="my/new/path/1495343165.jpg" />
?>
Lawrence Cherone
  • Thank you. In the same HTML I have other images with different paths, and I am not planning to change them. – Zuhair Ali Jun 04 '17 at 11:30
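
One possible way to adapt this answer to the requirement in the comment above is to loop over all img elements and skip those whose src does not start with the old path. This is only a sketch: the function name `replacePicsImages` and the sample HTML are hypothetical, and it assumes the input fragment has a single root element, because it passes `LIBXML_HTML_NOIMPLIED` to avoid added `<html>` and `<body>` wrappers.

```php
<?php
// Hypothetical helper: rewrites only images under $oldPrefix.
function replacePicsImages(string $html, string $oldPrefix, string $newPath): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate imperfect markup
    // Assumption: $html has exactly one root element.
    $doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    // getElementsByTagName returns a live list, so take a snapshot
    // before replacing nodes while iterating.
    $images = iterator_to_array($doc->getElementsByTagName('img'));

    foreach ($images as $img) {
        $src = $img->getAttribute('src');
        if (strpos($src, $oldPrefix) !== 0) {
            continue; // different path: leave this image untouched
        }
        $newImg = $doc->createElement('img');
        $newImg->setAttribute('src', $newPath . basename($src));
        $img->parentNode->replaceChild($newImg, $img);
    }

    return $doc->saveHTML();
}

// Hypothetical sample with one matching and one non-matching image.
$page = '<p>'
      . '<img src="media/lib/pics/1495343165.jpg" style="margin: 5px;" />'
      . '<img src="assets/icons/star.gif" class="icon" />'
      . '</p>';

$output = replacePicsImages($page, 'media/lib/pics/', 'my/new/path/');
echo $output;
```

The prefix check uses `strpos(...) !== 0` rather than `str_starts_with` so that the sketch also runs on PHP 7.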