How to skip html headings and find number with regex?

Question

I want to find a number in text, but skip the digits that belong to H1, H2, H3 and so on (all possible HTML heading tags).

Example 1:

<div>Today is good day. I got<h3>3<span> lotto tickets</span></h3></div>

Example 2:

I want to buy lotto tickets. <h1>Maybe 10 is enough</h1>

Example 3:

I want to buy lotto tickets. <h1>4 or 5</h1> is enough.

I have this pattern:

lotto tickets\D{0,15}(\d+\,\d+|\d+\.\d+|\d+)

But every time I get the digit from the HTML tag instead: <h3> (3), <h1> (1). How can I skip them?

In example 1 I should get nothing.

In example 2 I should get the number 10.

In example 3 I should get the number 4.

(Numbers can contain . or , for example: 2.5)

6 years old and still relevant : http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 — CD001, Nov 17 '15 at 16:15

score 1 · Accepted Answer · answered Nov 17 '15 at 16:30

This is one of those instances where perhaps regex isn't being used correctly.

Yes, you could do it with regex alone, but an easier way (and one that is also faster to run) is to call strip_tags() on your string first to get rid of all the HTML tags, and then apply a standard regex for the numbers.

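$string = "<h3>This is post number 10</h3>";
// Strip all HTML tags, so the 3 in <h3> can no longer match.
$cleanString = strip_tags($string);
// $number[0] holds the first whole number found ("10" here).
preg_match("%\b[0-9]+\b%", $cleanString, $number);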
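
To tie this back to the three examples in the question, here is a minimal sketch that strips the tags first and then applies the question's own "lotto tickets" pattern. The helper name findTicketCount is hypothetical, and the number alternation is written in a shorter but equivalent form; treat it as an illustration under those assumptions rather than a definitive solution.

```php
<?php
// Hypothetical helper (not from the original answer): returns the first
// number that follows "lotto tickets" in the tag-free text, or null.
function findTicketCount(string $html): ?string
{
    // Remove all tags so the digits in <h1>...<h6> can no longer match.
    $text = strip_tags($html);

    // The question's pattern: up to 15 non-digits, then a number with an
    // optional "," or "." fractional part (e.g. 2.5).
    if (preg_match('/lotto tickets\D{0,15}(\d+(?:[.,]\d+)?)/', $text, $m)) {
        return $m[1];
    }
    return null;
}

$examples = [
    '<div>Today is good day. I got<h3>3<span> lotto tickets</span></h3></div>',
    'I want to buy lotto tickets. <h1>Maybe 10 is enough</h1>',
    'I want to buy lotto tickets. <h1>4 or 5</h1> is enough.',
];

foreach ($examples as $html) {
    var_dump(findTicketCount($html)); // NULL, then "10", then "4"
}
```

Note that strip_tags() removes the tags but keeps their text content, so a number that sits inside a heading (such as the 10 in example 2) is still found.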

score 0 · Answer 2 · answered Nov 18 '15 at 06:10

You can use the following regex, which captures the first integer inside a heading element that has no nested tags:

<h[1-6]>[^\d\<]*(\d+)[^\<]+<\/h[1-6]>
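
A hedged sketch of how a pattern like this might be used from PHP. It is a variant, not the exact regex above: it also accepts a "." or "," fraction, and it allows nothing to follow the number, because with the trailing [^\<]+ a heading such as <h1>10</h1> would capture only 1. Both changes and the helper name firstHeadingNumber are my assumptions.

```php
<?php
// Hypothetical helper (not from the original answer): returns the first
// number found directly inside an attribute-free heading, or null.
function firstHeadingNumber(string $html): ?string
{
    // <h1>..<h6>, any non-digit text, the number (optional fraction),
    // any remaining text that contains no tag, then the closing heading tag.
    $pattern = '~<h[1-6]>[^\d<]*(\d+(?:[.,]\d+)?)[^<]*</h[1-6]>~i';

    return preg_match($pattern, $html, $m) ? $m[1] : null;
}

var_dump(firstHeadingNumber('I want to buy lotto tickets. <h1>Maybe 10 is enough</h1>')); // "10"
var_dump(firstHeadingNumber('I want to buy lotto tickets. <h1>4 or 5</h1> is enough.'));  // "4"
var_dump(firstHeadingNumber('<div>I got<h3>3<span> lotto tickets</span></h3></div>'));    // NULL
```

This approach only looks inside headings and ignores the "lotto tickets" context. It also does not match headings whose opening tag carries attributes or whose content contains nested tags, which is why the third call returns nothing.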

Mayur Koshti
