3

I want to use a regular expression to extract the contents of the following HTML tag, which I retrieved with cURL:

<span class='ui-allscores'>IND - 203/9 (49.4 Ovs)</span>

so that the output is "IND - 203/9 (49.4 Ovs)".

I have written the following code, but it is not working:

$one="<span class='ui-allscores'>IND - 203/9 (49.4 Ovs)</span>";
$five="~(?<=<span class='ui-allscores'>)[.]*(?=</br></span>)~";
preg_match_all($five,$one,$ui);
print_r($ui);
Cœur
viki

3 Answers

7

Try this one:

$string = "<span class='ui-allscores'>IND - 203/9 (49.4 Ovs)</span>";

Dynamic span tag:

preg_match('/<span[^>]*>(.*?)<\/span>/si', $string, $matches);

Specific span tag:

preg_match("/<span class='ui-allscores'>(.*?)<\/span>/si", $string, $matches);

// Output
array (size=2)
  0 => string '<span class='ui-allscores'>IND - 203/9 (49.4 Ovs)</span>' (length=56)
  1 => string 'IND - 203/9 (49.4 Ovs)' (length=22)
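To use the captured text rather than just dump the array, read index 1 of $matches after checking that preg_match actually found something. This is only a minimal sketch: the helper name extractScore is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: returns the text inside the first
// <span class='ui-allscores'> element, or null when there is no match.
function extractScore(string $html): ?string
{
    $pattern = "/<span class='ui-allscores'>(.*?)<\/span>/si";

    // preg_match returns 1 on a match, 0 on no match, false on error.
    if (preg_match($pattern, $html, $matches) === 1) {
        return $matches[1]; // group 1 holds the text between the tags
    }

    return null;
}

$string = "<span class='ui-allscores'>IND - 203/9 (49.4 Ovs)</span>";
$score = extractScore($string);
echo $score, PHP_EOL;
```

Checking the return value avoids an undefined-index notice when the markup changes and the pattern no longer matches.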
Bora
1

If you simply want to remove the HTML tags, use PHP's built-in strip_tags function.

See also another answer on removing HTML tags: "Strip all HTML tags, except allowed".
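As a quick illustration, applied to the string from the question (the variable names are only for this sketch):

```php
<?php
$html = "<span class='ui-allscores'>IND - 203/9 (49.4 Ovs)</span>";

// strip_tags removes every tag and keeps only the text between them.
$text = strip_tags($html);

echo $text, PHP_EOL; // prints: IND - 203/9 (49.4 Ovs)
```

Note that strip_tags removes all tags in the input, so on a whole page it returns the text of every element, not just the one span.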

Joshua Kissoon
1

The problem with your regex is the [.] part. Inside a character class the dot loses its special meaning, so [.] matches only a literal dot. Just remove the square brackets:

 $five="~(?<=<span class='ui-allscores'>).*(?=</br></span>)~";

The next problem is the greediness of *. You can make the quantifier lazy by putting a ? after it:

$five="~(?<=<span class='ui-allscores'>).*?(?=</br></span>)~";
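One caveat when trying this against the sample string: the lookahead above still requires a literal </br> before </span>, and the string shown in the question does not contain one. The sketch below therefore drops </br> from the lookahead. That change is an assumption about the real markup, not something the question confirms.

```php
<?php
$one = "<span class='ui-allscores'>IND - 203/9 (49.4 Ovs)</span>";

// Assumption: the real markup has no </br> before </span>,
// so the lookahead only checks for the closing span tag.
$five = "~(?<=<span class='ui-allscores'>).*?(?=</span>)~";

// The lookbehind and lookahead are zero-width, so the whole match
// ($ui[0]) is just the text between the tags.
if (preg_match($five, $one, $ui) === 1) {
    echo $ui[0], PHP_EOL; // prints: IND - 203/9 (49.4 Ovs)
}
```

If the page really does emit </br> before the closing tag, keep it in the lookahead as written in the answer.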

But the overall point is: you should most probably use an HTML parser for this job!

See How do you parse and process HTML/XML in PHP?
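For reference, a parser-based version might look roughly like this, using PHP's bundled DOM extension. It is a sketch under the assumption that the class attribute is exactly 'ui-allscores'; the variable names are illustrative.

```php
<?php
$html = "<span class='ui-allscores'>IND - 203/9 (49.4 Ovs)</span>";

// Real-world pages are rarely valid HTML; collect parser warnings
// internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every span whose class attribute is exactly 'ui-allscores'.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query("//span[@class='ui-allscores']");

$scores = [];
foreach ($nodes as $node) {
    $scores[] = trim($node->textContent);
}

print_r($scores);
```

Unlike the regex, this keeps working if the attribute quoting or the whitespace inside the tag changes. If the element carries additional classes, the XPath expression would need a contains() test instead of an exact comparison.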

stema