I'm learning regex but can't figure this out. I want to get the entire HTML inside one element of a page. How should I proceed?

I already tried this:

/\< td class=\"desc1\"\>(.+)/i

It returns:

Array
(
[0] => < td class="desc1">
[1] => 
)

The HTML I'm matching against is this:

<table id="profile" cellpadding="1" cellspacing="1">
<thead>
<tr>
<th colspan="2">Jogador TheInFEcT </th>
</tr>
<tr>
<td>Detalhes</td>
<td>Descrição:</td>

</tr>
</thead><tbody>
<tr>
<td class="empty"></td><td class="empty"></td>
</tr>
<tr>
<td class="details">
<table cellpadding="0" cellspacing="0">
<tbody><tr>

<th>Classificação</th>
<td>11056</td>
</tr>
<tr>
<th>Tribo:</th>
<td>Teutões</td>
</tr>

<tr>
<th>Aliança:</th>
<td>-</td>
</tr>
<tr>
<th>Aldeias:</th>
<td>1</td>

</tr>
<tr>
<th>População:</th>
<td>2</td>
</tr><tr>
<td colspan="2" class="empty"></td>
</tr>
<tr>

<td colspan="2"> <a href="spieler.php?s=1">» Alterar perfil</a></td>
</tr>

</tbody></table>

</td>
<td class="desc1">
<div>STATUS: OFNAaaaAA</div>
</td>

</tr>
</tbody>
</table>

I need to get all the code inside the <td class="desc1">, like this:

<div >STATUS: OFNAaaaAA< /div>
</td>

</tr>
</tbody>
</table>

Could someone help me out?

Thanks in advance.

Lucas

2 Answers

I usually use

$dom = new DOMDocument();
$dom->loadHTML($htmldata);

to parse the HTML into a DOM tree. Then you can use

$node = $dom->getElementById($id);
/* or */
$nodes = $dom->getElementsByTagName($tag); 

to get your HTML/XML node. Then use

$node->textContent

to get the text inside the node (markup stripped).
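
If the markup inside the cell is needed rather than just its text, one option is to serialize each child node with saveHTML. The following is only a minimal sketch under some assumptions: the page is already in a string called $htmldata (the value below is a hypothetical, trimmed-down version of the table from the question), and the cell is selected with a DOMXPath query on its class, which is a substitution for the id/tag lookups shown above.

```php
<?php
// Hypothetical sample input: a shortened version of the table in the question.
$htmldata = '<table id="profile"><tbody><tr>'
    . '<td class="details">x</td>'
    . '<td class="desc1"><div>STATUS: OFNAaaaAA</div></td>'
    . '</tr></tbody></table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($htmldata);
libxml_clear_errors();

// Select the first <td> whose class attribute is exactly "desc1".
$xpath = new DOMXPath($dom);
$cell  = $xpath->query('//td[@class="desc1"]')->item(0);

// Concatenate the serialized children to get the cell's inner HTML.
$innerHtml = '';
if ($cell !== null) {
    foreach ($cell->childNodes as $child) {
        $innerHtml .= $dom->saveHTML($child);
    }
}

echo $innerHtml, "\n";
```

Passing a node to saveHTML requires PHP 5.3.6 or later. The exact-match XPath test would miss a cell with several classes, so treat the query as a starting point.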

ajreal
erinus

Try this. It does not cover all possible cases, but it should work:

/<td\s+class=['"]\s*desc1\s*['"]\s*>((.|\n)*)<\/td>/i

Tested with: http://www.pagecolumn.com/tool/pregtest.htm

Edit: improved solution suggested by Alan Moore:

/<td\s+class=['"]\s*desc1\s*['"]\s*>(.*?)<\/td>/s
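
To see what the lazy quantifier changes, here is a small sketch using preg_match. The input string is a hypothetical snippet with a second cell after the target one, and the variable names are made up for the illustration; the first pattern is the one above, and the second differs only in using a greedy (.*).

```php
<?php
// Hypothetical input: the target cell followed by another cell in the same row.
$html = "<tr>\n<td class=\"desc1\">\n<div>STATUS: OFNAaaaAA</div>\n</td>\n<td>other</td>\n</tr>";

// Same pattern as above: /s lets "." match newlines, (.*?) stops at the first </td>.
$lazy   = '/<td\s+class=[\'"]\s*desc1\s*[\'"]\s*>(.*?)<\/td>/s';
// Greedy variant for comparison: (.*) runs on to the last </td> in the input.
$greedy = '/<td\s+class=[\'"]\s*desc1\s*[\'"]\s*>(.*)<\/td>/s';

preg_match($lazy, $html, $m);
$lazyInner = trim($m[1]);

preg_match($greedy, $html, $g);
$greedyInner = trim($g[1]);

echo $lazyInner, "\n";
echo "----\n";
echo $greedyInner, "\n";
```

With this input the lazy capture holds only the div, while the greedy capture also swallows the closing tag of the first cell and the start of the next one.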
Giulio Pulina
  • Use the `/s` modifier and you won't have to add the `|\n`. You should use a reluctant quantifier, too: `(.*?)`. Otherwise, you'll match everything up to the very *last* `</td>` in the document. It works okay in the example only because the target `<td>` element happens to **be** the last one. – Alan Moore Dec 26 '10 at 23:47
  • Yeah, you're right :) I edited the post following your suggestions. – Giulio Pulina Dec 27 '10 at 22:13