
I want to find a number, but skip the digits in the tag names H1, H2, H3 and so on (all possible HTML heading variants).

Example 1:

<div>Today is good day. I got<h3>3<span> lotto tickets</span></h3></div>

Example 2:

I want to buy lotto tickets. <h1>Maybe 10 is enough</h1>

Example 3:

I want to buy lotto tickets. <h1>4 or 5</h1> is enough.

I have this code:

lotto tickets\D{0,15}(\d+\,\d+|\d+\.\d+|\d+)

But every time I get the numbers from the HTML tags instead: <h3> (3), <h1> (1). How can I skip them?

In example 1 I should get nothing.

In example 2 I should get the number 10.

In example 3 I should get the number 4.

(Numbers can contain . or , for example: 2.5)

Pastuh

2 Answers


This is one of those instances where perhaps regex isn't being used correctly.

Yes, you could do it with regex alone, but an easier way (and one that is also faster to run) would be to run strip_tags() on your string first to get rid of all the HTML tags, and then just apply a standard regex for the numbers.

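$string = "<h3>This is post number 10</h3>";
// Remove all HTML tags so digits in tag names such as <h3> cannot match.
$cleanString = strip_tags($string);
// After this call, $number[0] holds the first standalone number ("10").
preg_match('%\b[0-9]+\b%', $cleanString, $number);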
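Applied to the three examples from the question, the same idea might look like the sketch below. It assumes the asker's original pattern is kept (with the decimal alternatives folded into one character class) and only strip_tags() is added in front of it; the function name findTicketNumber is hypothetical.

```php
<?php
// Hypothetical helper: strip the markup first, then apply the asker's pattern.
function findTicketNumber(string $html): ?string
{
    // With the tags gone, "<h3>" and "<h1>" can no longer contribute digits.
    $text = strip_tags($html);

    // The pattern from the question: a number (optionally with . or ,)
    // within 15 non-digit characters after "lotto tickets".
    $pattern = '/lotto tickets\D{0,15}(\d+[.,]\d+|\d+)/';

    if (preg_match($pattern, $text, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Example 1: no number follows "lotto tickets".
var_dump(findTicketNumber('<div>Today is good day. I got<h3>3<span> lotto tickets</span></h3></div>')); // NULL
// Example 2
var_dump(findTicketNumber('I want to buy lotto tickets. <h1>Maybe 10 is enough</h1>')); // string(2) "10"
// Example 3
var_dump(findTicketNumber('I want to buy lotto tickets. <h1>4 or 5</h1> is enough.')); // string(1) "4"
```

Note that strip_tags() keeps the text inside the headings, which is what example 2 and example 3 need.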
Liam Wiltshire

You should use the following regex:

<h[1-6]>[^\d\<]*(\d+)[^\<]+<\/h[1-6]>
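As a rough sketch of how this pattern could be used from PHP, the snippet below wraps it in ~ delimiters (an assumption, any PCRE delimiter would do) and runs it against the three examples from the question. The function name firstHeadingNumber is hypothetical.

```php
<?php
// Hypothetical helper: return the first number found inside an <h1>..<h6>
// element, using the pattern from this answer unchanged.
function firstHeadingNumber(string $html): ?string
{
    $pattern = '~<h[1-6]>[^\d\<]*(\d+)[^\<]+<\/h[1-6]>~';

    if (preg_match($pattern, $html, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Example 1: "3" is followed directly by "<span>", so nothing matches.
var_dump(firstHeadingNumber('<div>Today is good day. I got<h3>3<span> lotto tickets</span></h3></div>')); // NULL
// Example 2
var_dump(firstHeadingNumber('I want to buy lotto tickets. <h1>Maybe 10 is enough</h1>')); // string(2) "10"
// Example 3
var_dump(firstHeadingNumber('I want to buy lotto tickets. <h1>4 or 5</h1> is enough.')); // string(1) "4"
```

Some limits of the pattern as written: it does not look for "lotto tickets" at all, it captures only the integer part of a number such as 2.5, it requires at least one character between the number and the closing tag (so `<h1>10</h1>` does not match), and it does not match headings that carry attributes.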
Mayur Koshti