
This is my regex for fetching all tags that have a class attribute:

preg_match_all('/<\s*\w*\s*class\s*=\s*"?\s*([\w\s%#\/\.;:_-]*)\s*"?.*?>/', $file, $matches);

It matches tags where class comes right after the tag name, like <a class="abc">

The problem is that if a tag has another attribute before class, this regex fails to match it.

E.g.: <a id="fig_3_1" class="figure-contents">

I want to get <a class="figure-contents">, ignoring fig_3_1.

Any idea how to exclude it?

sandesh
Aarush Sen

2 Answers

<\s*\w*.*?\s*class\s*=\s*"?\s*([\w\s%#\/\.;:_-]*)\s*"?.*?>

This will probably work, but you would be better off using simple_html_dom.
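As a rough check, here is a minimal sketch that runs the pattern above through preg_match_all. The $html sample string is made up for illustration and is not from the question. Note that the added .*? can run past a closing > while it looks for class, so the full match in $matches[0] may start at an earlier tag than the one that actually carries the class; only the captured class value in $matches[1] is reliable here.

```php
<?php
// Hypothetical sample input, assumed for illustration only.
$html = '<a class="abc">x</a><a id="fig_3_1" class="figure-contents">y</a>';

// Pattern from the answer above: .*? after the tag name allows
// other attributes (such as id) to appear before class.
$pattern = '/<\s*\w*.*?\s*class\s*=\s*"?\s*([\w\s%#\/\.;:_-]*)\s*"?.*?>/';

preg_match_all($pattern, $html, $matches);

// $matches[1] holds the captured class values, one per match.
print_r($matches[1]);
```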

quAnton

Take a look at this amazing SO post and reconsider.

You will most likely be better off using an HTML parser instead. You can do so using PHP's DOM extension (DOMDocument).

A simple example of how it can be used is below.

$dom = new DOMDocument;
$dom->loadHTML($html);
$images = $dom->getElementsByTagName('img');
foreach ($images as $image) {
    $image->setAttribute('src', 'http://example.com/' .$image->getAttribute('src'));
}
$html = $dom->saveHTML();
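Applied to the original question, a similar sketch could collect every element that has a class attribute, wherever that attribute sits in the tag. This assumes the markup is already in a string; the $html sample and the $classes array are hypothetical names used only for illustration, and the XPath expression //*[@class] is one possible way to select such elements.

```php
<?php
// Hypothetical sample input, assumed for illustration only.
$html = '<div><a class="abc">x</a>'
      . '<a id="fig_3_1" class="figure-contents">y</a>'
      . '<p id="no-class">z</p></div>';

// Collect libxml warnings internally instead of printing them,
// since real-world markup is often not perfectly valid.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

// Select every element that carries a class attribute,
// regardless of which attributes come before it.
$xpath = new DOMXPath($dom);
$classes = [];
foreach ($xpath->query('//*[@class]') as $node) {
    $classes[] = $node->nodeName . ': ' . $node->getAttribute('class');
}

print_r($classes);
```

Because the parser handles attribute order, quoting, and whitespace itself, the id attribute before class needs no special treatment.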
Community
MX D