I have this expression in PHP:

preg_match("/\<div id=\"servertc\">(.+)\<\/div>/",$data,$out);

And my $data contains:

<div id="servertc">nowy serwer evolution!<br><br>
~ Server Info ~<br>
IP: axera.pl (Port: 7171)<br>
Online: 24/7<br>
World type: PVP (Protection level: &gt;100)<br>
House rent: disabled.<br>
~ Rates ~<br>
Experience From Player: x2<br>
Magic Level: x15<br>
Skills: x30<br>
Loot: x3<br>
Spawn: x3<br>
Houses: 100 level<br>
Guilds: 8 level (Create via website)<br>
Red Skull (24h): 25 unjustified kills per a day<br>
Black Skull (48h): 50 unjustified kills per a day<br>
Idle kick time = 15 minut<br>
~ Exp stages ~<br>
1-50: x 650<br>
51-75: x 450<br>
76-100: x 300<br>
101-150: x 150<br>
151-175: x 100<br>
176-190: x 75<br>
191-230: x 35<br>
231-250: x 20<br>
251-280: x 15<br>
281-300: x 8<br>
301 +: x 2 
</div>

The script returns an empty array. What is the problem?

Delimitry

3 Answers


You need to use the PCRE_DOTALL (s) flag to make the dot match a newline:

/\<div id=\"servertc\">(.+?)\<\/div>/s
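As a minimal sketch of the difference the flag makes, the snippet below runs the same pattern with and without `s`. The `$data` string is a shortened, hypothetical stand-in for the markup in the question, not the real page.

```php
<?php
// Hypothetical sample input: a shortened version of the markup from the question.
$data = "<div id=\"servertc\">first line<br>\nsecond line<br>\n</div>\n<div id=\"other\">x</div>";

// Without the s flag, "." stops at the first newline, so nothing matches.
$withoutFlag = preg_match('/<div id="servertc">(.+?)<\/div>/', $data, $out1);

// With the s flag, "." also matches newlines; the lazy ".+?" stops at the first </div>.
$withFlag = preg_match('/<div id="servertc">(.+?)<\/div>/s', $data, $out2);

var_dump($withoutFlag); // int(0)
var_dump($withFlag);    // int(1)
echo $out2[1];          // the two lines between the tags, newlines included
```

The lazy quantifier matters once the page contains more than one div: a greedy `.+` with the `s` flag would run on to the last `</div>` in the input.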

However, let me warn you that parsing HTML with regular expressions is a bad idea. It is better to use a DOM parser for HTML text like yours.
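For comparison, here is a rough sketch of the DOM route using PHP's built-in DOMDocument and DOMXPath. The `$data` string is a hypothetical, shortened page, and the XPath expression simply selects the div by its id.

```php
<?php
// Hypothetical sample input standing in for the page from the question.
$data = '<html><body><div id="servertc">IP: axera.pl (Port: 7171)<br>Online: 24/7<br></div></body></html>';

$doc = new DOMDocument();
// Real-world HTML is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($data);
libxml_clear_errors();

// Select the div by its id attribute, wherever it sits in the document.
$xpath = new DOMXPath($doc);
$node = $xpath->query('//div[@id="servertc"]')->item(0);

// textContent returns the text only; the <br> tags are dropped.
$text = $node !== null ? $node->textContent : null;
echo $text;
```

If the markup including the `<br>` tags is needed, `$doc->saveHTML($node)` serializes the element itself, outer `<div>` tags included.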

anubhava
  • If the HTML input is well known, there is no problem. It's even better, considering that HTML and XML parsers, especially DOMDocument, are extremely slow and CPU-intensive. – Havenard Apr 19 '13 at 20:33
  • @Havenard: It might seem to work initially, but HTML is so irregular that it can break at unexpected times. – anubhava Apr 19 '13 at 20:34
  • Only if you don't know the origin of your HTML. In that case, yes, you can't use something as simple as RegEx to do it, but since he knows where he is putting his spoon, he can safely rely on it to extract the data he needs, at least until the site changes its design. – Havenard Apr 19 '13 at 20:37
  • `at least until the site changes its design` That is what I also meant to highlight. – anubhava Apr 19 '13 at 20:38
  • And with DOM he wouldn't have to change his code? Now you are being silly. – Havenard Apr 19 '13 at 20:39
  • You will find thousands of those silly contributors on SO :) [Enlighten yourself](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – anubhava Apr 19 '13 at 20:42
  • 1
    @Havenard - It depends on the changes. If the HTML changed from `
    ` to `
    `, the regex approach will fail. But a DOM approach that is latching onto the `//div[@id="servertc"]` will continue to function perfectly.
    – nickb Apr 19 '13 at 20:42

Here is a regular expression you can use in this case:

preg_match("#<div id=\"servertc\">(.+)</div>#is", $data, $out);

In this case the # sign acts as the delimiter. The i flag makes the regex case-insensitive, and the s flag makes the dot match newline characters as well. What I like about using # as a delimiter is that you don't have to escape the / in </div>, which is very convenient when working with HTML code like in your example.
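To illustrate the delimiter choice, here is a small sketch that writes the same pattern both ways. The one-line `$data` is a hypothetical input, with uppercase tags so that the i flag has something to do.

```php
<?php
// Hypothetical one-line input, just to compare the two delimiter styles.
$data = '<DIV id="servertc">hello</DIV>';

// With "/" as the delimiter, the slash in </div> has to be escaped.
preg_match('/<div id="servertc">(.+)<\/div>/is', $data, $slash);

// With "#" as the delimiter, the same pattern needs no escaping.
preg_match('#<div id="servertc">(.+)</div>#is', $data, $hash);

var_dump($slash[1] === $hash[1]); // bool(true): both capture "hello"
```

Single-quoted PHP strings also avoid the `\"` escapes around the attribute value.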

So if you use the regex I'm suggesting, your $out[1] will contain:

nowy serwer evolution!<br><br>
~ Server Info ~<br>
IP: axera.pl (Port: 7171)<br>
Online: 24/7<br>
World type: PVP (Protection level: &gt;100)<br>
House rent: disabled.<br>
~ Rates ~<br>
Experience From Player: x2<br>
Magic Level: x15<br>
Skills: x30<br>
Loot: x3<br>
Spawn: x3<br>
Houses: 100 level<br>
Guilds: 8 level (Create via website)<br>
Red Skull (24h): 25 unjustified kills per a day<br>
Black Skull (48h): 50 unjustified kills per a day<br>
Idle kick time = 15 minut<br>
~ Exp stages ~<br>
1-50: x 650<br>
51-75: x 450<br>
76-100: x 300<br>
101-150: x 150<br>
151-175: x 100<br>
176-190: x 75<br>
191-230: x 35<br>
231-250: x 20<br>
251-280: x 15<br>
281-300: x 8<br>
301 +: x 2 

I hope this is what you were looking for and I hope that this answer has helped you.

maringtr

According to the documentation, . does not match a newline by default; the s (PCRE_DOTALL) modifier changes that.
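This is easy to check in isolation. The sketch below uses a made-up three-character subject rather than the HTML from the question.

```php
<?php
// "a", a newline, then "b": a minimal input for testing what "." matches.
$subject = "a\nb";

$default = preg_match('/a.b/', $subject);  // "." does not match the newline
$dotAll  = preg_match('/a.b/s', $subject); // s (PCRE_DOTALL) lets "." match it

var_dump($default); // int(0)
var_dump($dotAll);  // int(1)
```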

Bart Friederichs