Get contents of a div from a URL

Question

Possible Duplicate:
How to implement a web scraper in PHP?
How to parse and process HTML with PHP?

I need to crawl a page and get the contents of a particular div. PHP and JavaScript are my two main options. How can it be done?

Have you considered Perl and [WWW-Mechanize](http://search.cpan.org/dist/WWW-Mechanize/)? — cctan, Feb 01 '12 at 09:05

score 3 · Answer 1 · answered Feb 01 '12 at 09:27

There are several ways to get the contents of a URL:

First Method:

Simple HTML DOM Parser: http://simplehtmldom.sourceforge.net/

Second Method:

<?php

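  $contents = file_get_contents("http://www.url.com");
  // Keep only the <div> tags; all other markup is stripped.
  $contents = strip_tags($contents, "<div>");
  // Capture the text of each <div> that has no nested tags; results land in $file_contents[1].
  preg_match_all("/<div[^>]*>([^<]*)<\/div>/is", $contents, $file_contents);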

?>

Third Method:

You can use jQuery-like selectors:

http://api.jquery.com/category/selectors/

score 2 · Accepted Answer · answered Feb 01 '12 at 09:07

This is quite a basic way to do it in PHP, and it returns the content as plain text. However, you may need to revise the regex for your particular needs.

<?php
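  $link = file_get_contents("http://www.domain.com");
  // Keep only the <div> tags; all other markup is stripped.
  $file = strip_tags($link, "<div>");
  // Capture the text of each <div> that has no nested tags; $content[1] holds the captured text.
  preg_match_all("/<div[^>]*>([^<]*)<\/div>/is", $file, $content);
  print_r($content);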
?>

score 2 · Answer 3 · answered Feb 01 '12 at 09:09


You can use Simple HTML DOM Parser, documented at http://simplehtmldom.sourceforge.net/manual.htm. It requires PHP 5 or later, but the nice thing is that you can find tags on an HTML page with selectors, just like in jQuery.
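If installing a third-party library is not an option, a similar selector-style lookup can be approximated with PHP's built-in DOMXPath. The sketch below is not the Simple HTML DOM API. The helper name `findDivsByClass` and the sample markup are made up for illustration, and it assumes the class name contains no quote characters.

```php
<?php
// Hypothetical helper: return the text of every <div> carrying a given class.
function findDivsByClass(string $html, string $class): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Rough XPath equivalent of the CSS selector "div.<class>".
    // Assumes $class contains no quote characters.
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Sample markup; in practice $sample would come from file_get_contents($url).
$sample = '<div class="post featured">First</div>'
        . '<div class="sidebar">Menu</div>'
        . '<div class="post">Second</div>';

$posts = findDivsByClass($sample, 'post');
print_r($posts);
```

The `concat`/`normalize-space` wrapping is there so that `post` matches `class="post featured"` but not a class such as `postscript`.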

answered by jerjer

score 1 · Answer 4 · answered Feb 01 '12 at 09:05

Specifically with jQuery, if you have a div like the following:

<div id="cool_div">Some content here</div>

You could use jQuery to get the contents of the div like this:

$('#cool_div').text(); // will return text version of contents...
$('#cool_div').html(); // will return HTML version of contents...

If you're using PHP to generate the content of the page, then you should be able to get a decent handle on the content and manipulate it even before it's returned to the screen and displayed. Hope this helps!

score 0 · Answer 5 · answered Feb 01 '12 at 09:06


Using PHP, you can try the DOMDocument class and its getElementById() or getElementsByTagName() methods.
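As a rough illustration of the DOMDocument approach, here is a minimal sketch that pulls one div out of an HTML string by its id. The function name `getDivContents` and the sample markup are assumptions made for the example; in a real script the HTML would come from something like `file_get_contents($url)`.

```php
<?php
// Hypothetical helper: return the text and inner HTML of the div with the given id,
// or null if no such element exists.
function getDivContents(string $html, string $id): ?array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $div = $doc->getElementById($id);
    if ($div === null) {
        return null;
    }

    // Build the inner HTML by serializing each child node in turn.
    $inner = '';
    foreach ($div->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }

    return [
        'text' => trim($div->textContent), // plain text, tags removed
        'html' => $inner,                  // markup inside the div
    ];
}

// Sample markup standing in for a fetched page.
$html = '<html><body><div id="cool_div">Some <b>content</b> here</div></body></html>';

$result = getDivContents($html, 'cool_div');
print_r($result);
```

Unlike the regex approach, this also copes with divs that contain nested tags, since the parser builds a real tree before anything is looked up.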

answered by Franquis
