I have this code which gets the HTML source of a page:
$page = file_get_contents('http://example.com/page.html');
$page = htmlentities($page);
I want to scrape some content from it. For example, say the page's source contains this:
<strong>technorati.com</strong><br />
Connection failed<br /><br />Pinging <strong>icerocket.com</strong><br />
Connection failed<br /><br />Pinging <strong>weblogs.com</strong><br />
Done<br /><br />Pinging <strong>newsgator.com</strong><br />
Done<br /><br />Pinging <strong>blo.gs</strong><br />
Done<br /><br />Pinging <strong>feedburner.com</strong><br />
Done<br /><br />Pinging <strong>blogstreet.com</strong><br />
Done<br /><br />Pinging <strong>my.yahoo.com</strong><br />
Connection failed<br /><br />Pinging <strong>moreover.com</strong><br />
Connection failed<br /><br />Pinging <strong>newsisfree.com</strong><br />
Done<br />
Is there a way I could scrape this from the source and store it in a variable, so it'll look like this:
technorati.com Connection failed
icerocket.com Connection failed
weblogs.com Done
Etc.
Of course, the page is dynamic, which is why I'm having a problem. Could I search for each site in the source? If so, how would I get the result that comes after it (Connection failed / Done)?
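To make the goal concrete, here is a rough sketch of the kind of thing I have in mind, using preg_match_all to capture each site name together with the text that follows it. It assumes every entry has the shape `<strong>site</strong><br />status<br />`, as in the sample above, and that the match runs on the raw HTML before htmlentities() is applied (once the tags are escaped, this pattern no longer matches). The hard-coded $page sample and the $results and $output variables are placeholders for illustration only, and a regex like this will probably break if the markup changes.

```php
<?php
// Hypothetical sample standing in for the fetched page.
// In practice this would be the raw result of file_get_contents(),
// NOT passed through htmlentities() first.
$page = <<<HTML
Pinging <strong>technorati.com</strong><br />
Connection failed<br /><br />Pinging <strong>icerocket.com</strong><br />
Connection failed<br /><br />Pinging <strong>weblogs.com</strong><br />
Done<br />
HTML;

// Group 1: the site name inside <strong>...</strong>.
// Group 2: the text between the following <br /> and the next <br />.
$pattern = '~<strong>([^<]+)</strong>\s*<br\s*/?>\s*([^<]+?)\s*<br\s*/?>~i';

$results = [];
if (preg_match_all($pattern, $page, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        // Map site name => result text.
        $results[$m[1]] = $m[2];
    }
}

// Build the "site result" lines in a single variable.
$output = '';
foreach ($results as $site => $status) {
    $output .= "$site $status\n";
}

echo $output;
```

Because the pattern captures whatever text follows each site, it would not need a fixed list of sites or of possible results.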
Thanks a lot for the help!
"connection result", does this change sometimes? – CaNNaDaRk Sep 06 '11 at 14:26