I have HTML code in a variable. For example, $html equals:

<div title="Cool stuff" alt="Cool stuff"><a title="another stuff">......</a></div>

I need to replace the content of all title attributes (title="Cool stuff", title="another stuff", and so on) with title="$newTitle".

Is there any non-regex way to do this?

And if I have to use regex, is there a better (performance-wise) and/or more elegant solution than what I came up with?

$html = '...';
$newTitle = 'My new title';

$matches = [];
preg_match_all(
    '/title=(\"|\')([^\"\']{1,})(\"|\')/',
    $html,
    $matches
);
$attributeTitleValues = $matches[2];

foreach ($attributeTitleValues as $title)
{
    $html = str_replace("title='{$title}'", "title='{$newTitle}'", $html);
    $html = str_replace("title=\"{$title}\"", "title=\"{$newTitle}\"", $html);
}
Marek Barta
  • You should make use of `SimpleXMLElement()` to convert the html into an object so that you can XPath over the nodes which have `title="whatever"`. See https://stackoverflow.com/a/65206705/2191572 to get started. – MonkeyZeus Dec 15 '20 at 19:44
  • @MonkeyZeus Ah, I didn't see the banner "There are disputes about this answer’s content ...". I've just seen that answer so many times when questions about Regex and HTML are asked – Tim Lewis Dec 15 '20 at 19:44
  • To get all nodes with a title then this could be used `//*[@title]` – MonkeyZeus Dec 15 '20 at 19:53
  • @MonkeyZeus thank you. That seems like the kind of solution I was looking for. – Marek Barta Dec 16 '20 at 17:13
  • I cannot accept this as the correct answer, so I will accept the other one (which was later) to close this up. But thank you again. – Marek Barta Jan 13 '21 at 16:08

1 Answer

Definitely don't use regex -- it is a dirty rabbit hole.
...the hole is dirty, not the rabbit :)

I prefer to use DOMDocument and DOMXPath to directly target all title attributes of all elements in your HTML document.

  • LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD flags are in place to prevent your output being garnished with a <!DOCTYPE> declaration and <html>/<body> tags.
  • // in the XPath expression says: go to any depth in search of matches
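
To make the effect of the two flags concrete, here is a minimal sketch that loads the same fragment with and without them. The variable names ($withWrappers, $fragmentOnly) are illustrative only, and the exact doctype text depends on the libxml version, so no output is quoted.

```php
<?php
$html = '<div title="Cool stuff"><a title="another stuff">......</a></div>';

// No flags: libxml treats the input as a full document, so the saved
// markup is expected to gain a doctype and <html>/<body> wrappers.
$plain = new DOMDocument();
$plain->loadHTML($html);
$withWrappers = $plain->saveHTML();

// With both flags: no implied wrapper elements and no default doctype.
$bare = new DOMDocument();
$bare->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$fragmentOnly = $bare->saveHTML();

echo $withWrappers, "\n", $fragmentOnly;
```

One caveat worth checking on your own input: as far as I know, LIBXML_HTML_NOIMPLIED behaves most predictably when the fragment has a single root element, as it does here.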

Code:

$html = <<<HTML
<div title="Cool stuff" alt="Cool stuff"><a title="another stuff">......</a></div>
HTML;
$newTitle = 'My new title';

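// Parse the fragment without adding a doctype or <html>/<body> wrappers
$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
// '//@title' selects every title attribute node at any depth
foreach ($xpath->query('//@title') as $attr) {
    $attr->value = $newTitle;
}
// Serialize the modified document back to an HTML string
echo $dom->saveHTML();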

Output:

<div title="My new title" alt="Cool stuff"><a title="My new title">......</a></div>
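
If you need this in more than one place, the same approach can be wrapped in a function. The sketch below is only one possible arrangement: the function name replaceTitleAttributes is hypothetical, and it selects elements with `//*[@title]` (the expression suggested in the comments on the question) and calls setAttribute() instead of assigning to the attribute node.

```php
<?php
/**
 * Hypothetical helper (the name is illustrative, not part of any library):
 * sets every title attribute in an HTML fragment to $newTitle.
 */
function replaceTitleAttributes(string $html, string $newTitle): string
{
    $dom = new DOMDocument();
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    $xpath = new DOMXPath($dom);

    // '//*[@title]' matches every element, at any depth, that has a title attribute
    foreach ($xpath->query('//*[@title]') as $element) {
        $element->setAttribute('title', $newTitle);
    }

    // saveHTML() appends a trailing newline; trim it for easier reuse
    return trim($dom->saveHTML());
}

$html = '<div title="Cool stuff" alt="Cool stuff"><a title="another stuff">......</a></div>';
$result = replaceTitleAttributes($html, 'My new title');
echo $result;
```

setAttribute() takes the new value as a plain string, so a title containing a character such as & should be escaped on output without extra work on your side.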
mickmackusa