I want to extract the content of an <a> tag with the class "bc-de fg" from an HTML file, like this:

<a class="bc-de fg"> XXXXXXXXXXXXX </a>

So I wrote the following regular expression:

$regexp = “<a\wclass="bc\wde">(.*?)<\/a>”

This doesn't work. I'm new to regular expressions and am trying to get more practice.

How can I correct this regular expression?

johnsyweb
lkkeepmoving

5 Answers

Try something like $regex = ':<a class="bc-de fg">(.*?)</a>:mi';

  • I used : as the delimiter so I don't have to escape forward slashes all the time.
  • . matches any single character except a newline (unless the s modifier is set).
  • *? repeats the previous item zero or more times, as few times as possible (lazy).
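
To make the effect of the lazy quantifier concrete, here is a minimal sketch that runs the pattern above next to a greedy variant. The sample string in $html is a made-up input, not taken from the question.

```php
<?php
// Hypothetical sample input: two matching anchors on the same line.
$html = '<a class="bc-de fg">first</a> <a class="bc-de fg">second</a>';

// Lazy (.*?): stops at the nearest closing </a>.
preg_match(':<a class="bc-de fg">(.*?)</a>:mi', $html, $lazy);

// Greedy (.*): runs on to the last closing </a> on the line.
preg_match(':<a class="bc-de fg">(.*)</a>:mi', $html, $greedy);

echo $lazy[1], "\n";   // first
echo $greedy[1], "\n"; // first</a> <a class="bc-de fg">second
```

With a single link per line both forms behave the same; the difference only shows up when several links share a line.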

Here's a simple cheatsheet you might find useful.

Nick Fury

Using regular expressions to parse HTML, or any nested structure, is considered bad practice. Use a DOM parser instead.

The problems in your regex are the escape characters (put the pattern in a single-quoted string) and the missing delimiters at its start and end (#...# or /.../).
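
As a rough sketch of the DOM approach, the following uses PHP's DOMDocument and DOMXPath classes (this assumes the DOM extension is enabled). The markup in $html and the variable names are illustrative only.

```php
<?php
// Hypothetical sample markup; attribute order and extra classes may vary.
$html = '<p><a href="#" class="fg bc-de extra"> Hello <b>world</b> </a>'
      . '<a class="other">skip</a></p>';

// Collect libxml warnings internally instead of printing them,
// since real-world HTML is rarely perfectly formed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Select <a> elements whose class attribute contains both tokens, in any order.
$query = '//a[contains(concat(" ", normalize-space(@class), " "), " bc-de ")'
       . ' and contains(concat(" ", normalize-space(@class), " "), " fg ")]';

$texts = [];
foreach ($xpath->query($query) as $node) {
    // textContent flattens nested tags such as <b> into plain text.
    $texts[] = trim($node->textContent);
}

print_r($texts);
```

Unlike a fixed regex, this keeps working when the link carries other attributes or when the classes appear in a different order.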

cth

Try this:

$str = '<a class="bc-de fg">Testing</a>';

// \s+ requires whitespace between the two classes; .*? stops at the first </a>
preg_match('/<a class="bc-de\s+fg">(?P<link>.*?)<\/a>/', $str, $matches);

echo "<pre>";
print_r($matches);

The captured text is then available as $matches['link'].

A more tolerant pattern also allows other attributes and surrounding whitespace:

preg_match('/<a\b[^>]*class="bc-de\s+fg"[^>]*>\s*(?P<link>.*?)\s*<\/a>/', $str, $matches);

Prasanth Bendra

Try this:

$regexp = '/<a class="bc-de fg">(.*?)<\/a>/';
preg_match_all($regexp, $subject, $matches);

Your answer will be in $matches. It should work in the scenario you mentioned. However, if the order of the attributes changes or more classes are assigned, this regex won't work. The best way to do this is to use DOM instead of regex.
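
To show where the results end up, here is a small sketch of how preg_match_all fills $matches in its default mode. The $subject string is a made-up input, and the pattern uses a lazy (.*?) so that each link is captured separately.

```php
<?php
// Hypothetical input containing two matching links.
$subject = '<a class="bc-de fg">one</a><a class="bc-de fg">two</a>';

$regexp = '/<a class="bc-de fg">(.*?)<\/a>/';
// preg_match_all returns the number of full pattern matches.
$count = preg_match_all($regexp, $subject, $matches);

// In the default PREG_PATTERN_ORDER mode, $matches[0] holds the full
// matches and $matches[1] holds the text of the first capture group.
print_r($matches[1]);
```

Here $matches[1] contains the inner text of every matching link, in document order.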

ksg91

Try [^(<a\W*class="bc\-de fg"\W*>)+(</a>)+]. You can use the negation operator ^.

Amy