Get text with PHP Simple HTML DOM Parser

Question

I'm using PHP Simple HTML DOM Parser to get text from a webpage. The page I need to manipulate looks something like this:

<html>
<head>
<title>title</title>
</head>
<body>
<div id="content">
<h1>HELLO</h1>
Hello, world!
</div>
</body>
</html>

I need to get the h1 element and the text that is not wrapped in any tag. To get the h1 I use this code:

$html = file_get_html("remote_page.html");
foreach($html->find('#content') as $text){
echo "H1: ".$text->find('h1', 0)->plaintext;
}

But what about the other text? I also tried this inside the foreach, but I get the full text:

$text->plaintext;

but it also returned the H1 content...

Why do you expect the `plaintext` member to return something else? – hakre, Mar 24 '12 at 18:14
I guess so. I can't recommend Simple HTML DOM Parser, though, only [`DOMDocument`](http://php.net/DOMDocument). With that it would be `->nodeValue`. – hakre, Mar 24 '12 at 18:51
I have the same problem: I want to extract the text that follows a tag and is not itself wrapped in a tag... – David, Apr 07 '14 at 15:48

score 0 · Answer 1 · edited Dec 14 '16 at 04:02

You can simply strip the HTML tags using `strip_tags`:

<?php
// The second argument lists tags to keep, so <br> survives here.
echo strip_tags($input, '<br>');
?>

edited Dec 14 '16 at 04:02

jrbedard

answered Dec 14 '16 at 03:41

Peachy

Why would you exclude the `<br>` tag? The OP said that all tags need to be stripped. – NonCreature0714 Dec 14 '16 at 04:01
You can leave that blank. – Peachy Dec 14 '16 at 06:26

score 0 · Answer 2 · answered Dec 14 '16 at 04:05

Use `strip_tags`, as @Peachy pointed out. However, passing `<br>` as the second argument means `strip_tags` will leave `<br>` tags in place, which is unnecessary. In your case,

<?php
    strip_tags($text);
?>

would work as you'd like, given that you are only selecting the element whose id is `content`.
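
To make the difference between the two calls concrete, here is a small sketch. The `$fragment` string is made up for illustration and is not the OP's page. Note that `strip_tags` only removes the markup, so the heading text stays in the result either way.

<?php
// Hypothetical fragment, loosely based on the markup in the question.
$fragment = "<h1>HELLO</h1>Hello,<br>world!";

// No second argument: every tag is removed, the text of every element stays.
$all = strip_tags($fragment);

// '<br>' as second argument: <br> is kept, all other tags are removed.
$keepBr = strip_tags($fragment, '<br>');

echo $all . "\n";    // HELLOHello,world!
echo $keepBr . "\n"; // HELLOHello,<br>world!
?>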

score 0 · Answer 3 · answered Jun 24 '21 at 10:14

Try this:

echo "H1: ".$text->find('h1', 0)->innertext;

answered Jun 24 '21 at 10:14

Malleron

score 0 · Answer 4 · answered Mar 24 '12 at 19:00

It looks like `$text->find('text', 2);` gets what you're looking for; however, I'm not sure how well that will work when the number of text nodes is unknown. I'll keep looking.
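
If the number of text nodes is not known in advance, one possible approach is to switch to the built-in `DOMDocument` (as hakre suggests in the comments) and collect only the text nodes that are direct children of the div. This is a rough sketch, not Simple HTML DOM code: the markup is inlined from the question rather than fetched from a remote page, and the variable names (`$source`, `$loose`) are just for illustration.

<?php
// Sample markup from the question, inlined so the sketch runs on its own.
$source = <<<'HTML'
<html>
<head>
<title>title</title>
</head>
<body>
<div id="content">
<h1>HELLO</h1>
Hello, world!
</div>
</body>
</html>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($source);
$xpath = new DOMXPath($doc);

// The h1 inside #content.
$h1 = $xpath->query('//div[@id="content"]/h1')->item(0)->nodeValue;

// text() selects only text nodes that are direct children of the div,
// so the text inside <h1> is not included.
$loose = [];
foreach ($xpath->query('//div[@id="content"]/text()') as $node) {
    $piece = trim($node->nodeValue);
    if ($piece !== '') {   // skip whitespace-only nodes between tags
        $loose[] = $piece;
    }
}

echo "H1: " . $h1 . "\n";
echo "Text: " . implode(' ', $loose) . "\n";
?>

For the sample page this prints "H1: HELLO" and "Text: Hello, world!". Because the loop collects every direct text node, it does not depend on a fixed index the way `find('text', 2)` does.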

answered Mar 24 '12 at 19:00

Korvin Szanto
