How to get this regular expression to match

Question

This is the sort of HTML string I will be performing matches on:

<span class="q1">+12 Spell Power and +10 Hit Rating</span>

I want to get +12 Spell Power and +10 Hit Rating out of the above HTML. This is the code I wrote:

preg_match('/<span class="q1">(.*)<\/span>/', $gem, $match);

But I think the \/ in <\/span> is escaping the / in </span> in a way that keeps the pattern from stopping the match there, so I get a lot more data than I want.

How can I escape the / in </span> while still having it part of the pattern?

Thanks.

score 3 · Accepted Answer · answered Jun 20 '10 at 00:48

Your escaping is fine: \/ matches a literal /. I think the reason your regex is getting more than you want is that * is greedy, matching as much as possible, so .* runs on to the last </span> in the input. Instead, use *?, which will match as little as possible:

preg_match('/<span class="q1">(.*?)<\/span>/', $gem, $match);
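To make the difference visible, here is a minimal sketch comparing the two patterns. The input string is a hypothetical one with two q1 spans (the second span is invented for illustration and is not from the question), since the greedy overrun only shows up when more than one </span> is present. The third call uses # as the pattern delimiter, which is one way to avoid escaping the / at all.

```php
<?php
// Hypothetical input: two q1 spans, so the greedy overrun is visible.
$gem = '<span class="q1">+12 Spell Power and +10 Hit Rating</span> '
     . '<span class="q1">+5 Stamina</span>';

// Greedy: .* runs to the LAST </span> in the string.
preg_match('/<span class="q1">(.*)<\/span>/', $gem, $greedy);

// Lazy: .*? stops at the FIRST </span>.
preg_match('/<span class="q1">(.*?)<\/span>/', $gem, $lazy);

// Same lazy pattern with # as the delimiter, so / needs no escaping.
preg_match('#<span class="q1">(.*?)</span>#', $gem, $alt);

echo $greedy[1], "\n"; // spills past the first closing tag
echo $lazy[1], "\n";   // only the first span's text
echo $alt[1], "\n";    // same result as the lazy pattern
```

If the input may contain several matching spans and all of them are needed, preg_match_all with the lazy pattern would collect each one separately.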

davidscolgan

That works, thanks. The reason I don't want to use the DOMDocument class is that it's a very small piece of HTML and this code will only be run once; I'm collecting data to be put into a database. No need to complicate things. :) – VIVA LA NWO Jun 20 '10 at 00:52

score 2 · Answer 2 · answered Jun 20 '10 at 00:46

Don't use regex to parse HTML. Use DOM instead, particularly the loadHTML method and getElementsByTagName('span'):

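    $doc = new DOMDocument();
    // $htmlString holds the HTML to search
    $doc->loadHTML($htmlString);
    $spans = $doc->getElementsByTagName('span');
    foreach ($spans as $span) {
        // keep only the spans carrying the q1 class
        if ($span->getAttribute('class') === 'q1') {
            echo $span->textContent, "\n";
        }
    }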

score 2 · Answer 3 · edited May 23 '17 at 12:26

Don't use regex to parse HTML. Use an HTML parser. See Robust, Mature HTML Parser for PHP.
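As a rough illustration of the parser route, here is a sketch using PHP's bundled DOM extension with DOMXPath. This is only one possible parser and not necessarily the one the linked question recommends. The variable names are made up for the example, and the XPath test @class="q1" assumes the class attribute is exactly q1 with no other classes.

```php
<?php
// The sample string from the question.
$htmlString = '<span class="q1">+12 Spell Power and +10 Hit Rating</span>';

$doc = new DOMDocument();
$doc->loadHTML($htmlString);

// Select every span whose class attribute is exactly "q1".
$xpath = new DOMXPath($doc);
$texts = [];
foreach ($xpath->query('//span[@class="q1"]') as $span) {
    $texts[] = $span->textContent;
}

print_r($texts);
```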

answered Jun 20 '10 at 00:47

Jason
