For example, I have a string like this:

#resultStats{opacity:0;top:13px}</style><div id="extabar"><div id="topabar" style="position:relative"><div class="ab_tnav_wrp" id="slim_appbar"><div id="sbfrm_l"><div id="resultStats">About 5,320 results<nobr> (0.13 seconds)&nbsp;</nobr></div></div></div></div><div id="botabar" style="display:none"></div></div><div></div></div><div class="mw" data-jibp="h" data-jiis="uc" id="ucs"></div><div class="mw"><div data-jibp="h" data-jiis="uc" id="akp"></div><div id="rcnt" style="clear:both;position:relative;zoom:1">

I need to extract 5,320 from it. I tried something like this: <div id="resultStats">(\d+(?:,\d+))<\/div>.

P.S. I need to extract it specifically from <div id="resultStats">.

Andy Lester
user1692333

1 Answer

Parsing individual HTML tags with a regex is doable, but it is not
recommended for HTML in general.
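
For comparison, here is a minimal sketch of the non-regex route. The question does not name a language, so Python and its standard-library html.parser module are assumptions, and the names ResultStatsParser and result_count_from_html are made up for this illustration. It keeps only the text that comes directly after the opening <div id="resultStats"> tag and then picks the number out of that text.

```python
import re
from html.parser import HTMLParser


class ResultStatsParser(HTMLParser):
    """Collects the text that directly follows <div id="resultStats">."""

    def __init__(self):
        super().__init__()
        self.capturing = False  # True only right after the target div opens
        self.text = ""

    def handle_starttag(self, tag, attrs):
        # Start capturing at the target div; any other opening tag stops it.
        self.capturing = tag == "div" and dict(attrs).get("id") == "resultStats"

    def handle_endtag(self, tag):
        # Any closing tag ends the first text node.
        self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.text += data


def result_count_from_html(html):
    """Return the number as a string (e.g. '5,320'), or None if not found."""
    parser = ResultStatsParser()
    parser.feed(html)
    parser.close()
    match = re.search(r"\d(?:,?\d)*", parser.text)
    return match.group(0) if match else None


sample = ('<div id="sbfrm_l"><div id="resultStats">About 5,320 results'
          '<nobr> (0.13 seconds)&nbsp;</nobr></div></div>')
print(result_count_from_html(sample))  # 5,320
```

The parser only locates the element; a small regex is still used on the extracted text, which is much less fragile than matching the markup itself.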

If you have to use a regex, here is a simplistic example:

<div\s+id\s*=\s*"resultStats"\s*>[^<]*?(\d(?:,?\d)*)[^<]*?<

Formatted:

 <div \s+ id \s* = \s* "resultStats" \s* >
 [^<]*? 
 (                             # (1 start)
      \d 
      (?: ,? \d )*
 )                             # (1 end)
 [^<]*? <

Output:

  **  Grp 0 -  ( pos 238 , len 42 ) 
 <div id="resultStats">About 5,320 results<
  **  Grp 1 -  ( pos 266 , len 5 ) 
 5,320
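
To show how the pattern could be applied in code, here is a short sketch. Python is an assumption (the question does not say which language is used), and extract_result_count is a hypothetical helper name. The sample string is a shortened version of the one in the question.

```python
import re

# Same pattern as above; group 1 holds the number.
RESULT_STATS = re.compile(
    r'<div\s+id\s*=\s*"resultStats"\s*>[^<]*?(\d(?:,?\d)*)[^<]*?<'
)


def extract_result_count(html):
    """Return the captured number as a string, or None if nothing matches."""
    match = RESULT_STATS.search(html)
    return match.group(1) if match else None


html = ('#resultStats{opacity:0;top:13px}</style><div id="sbfrm_l">'
        '<div id="resultStats">About 5,320 results<nobr> (0.13 seconds)'
        '&nbsp;</nobr></div></div>')

count = extract_result_count(html)
print(count)  # 5,320
if count is not None:
    # Drop the thousands separators to get an integer.
    print(int(count.replace(",", "")))  # 5320
```

The leading #resultStats in the CSS is not matched, because the pattern requires the literal <div before the id attribute.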