-1

I am trying to get the contents of a div from the id name.

Here is the div I am trying to get: <div id="article-body"> ... </div>

However, the div is on an external website, so the page has to be fetched by its full URL (http://www... etc.).

I am sure it's possible; I'm just not sure whether I should use PHP, the DOM, jQuery, or something else.

I think this should be doable in a few lines of code, but I don't know which method is best. Thanks for any tips or ideas.

UPDATE: It was suggested that this is a duplicate question. It is not: I tried the code from one of the suggested duplicates (shown below) and it does not work.

Here is one of the errors: Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: ID changeRegionForm already defined in Entity, line: 85 in /home/content/w/i/s/wisdom33/html/testing/getDivExternalWebsite.php on line 14

Here is a link to the code: http://massmediamail.com/testing/getDivExternalWebsite.php

Here is the code:

<html>

<body>
<?php
$doc = new DOMDocument;

// We need to validate our document before referring to the id
$doc->validateOnParse = true;
$doc->loadHTML(file_get_contents('http://www.lifesitenews.com/news/second-madagascar-archbishop-criticizes-catholic-relief-services-full-trans'));

var_dump($doc->getElementById('article-body'));
?>
</body>
</html>
Papa De Beau
  • Possible duplicates: http://stackoverflow.com/questions/5045598/getting-elements-of-a-div-from-another-page-php http://stackoverflow.com/questions/26947/how-to-implement-a-web-scraper-in-php http://stackoverflow.com/questions/3577641/how-to-parse-and-process-html-with-php – Amal Murali Sep 09 '13 at 20:07
  • Wait... are you trying to fetch contents from an external site and then put them inside a DIV with ID=article-body in your page, or do you want to fetch the contents of a DIV with ID=article-body that is actually on another site? – LexLythius Sep 09 '13 at 20:10
  • Just need the contents. And then I will package them in a div but with my own css styles etc. – Papa De Beau Sep 09 '13 at 20:11
  • @PapaDeBeau Sorry, I still don't understand: where does DIV#article-body exist, in your page or in the external source? (Not its contents, the DIV element itself.) This is important because the latter would imply *parsing* the HTML you get from the external source – LexLythius Sep 09 '13 at 20:15
  • @Amal it's not a duplicate because I tried the code from one of those questions you suggested and I got this error: Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: ID changeRegionForm already defined in Entity, line: 85 in /home/content/w/i/s/wisdom33/html/testing/getDivExternalWebsite.php on line 14 – Papa De Beau Sep 09 '13 at 20:16
  • Yes the DIV called "article-body" is on an external website. I want to get this content and use it in my site. – Papa De Beau Sep 09 '13 at 20:17
  • @PapaDeBeau You should frame your question around the error you are getting with your attempt and show your relevant code. – danronmoon Sep 09 '13 at 20:21
  • @danronmoon, I made a change in the question based off of your suggestion. Thanks – Papa De Beau Sep 09 '13 at 20:25

2 Answers

6

The answer is to use the PHP Simple HTML DOM Parser: http://simplehtmldom.sourceforge.net/

Once you download the library from the URL above, place it somewhere your script can reach, and include it with the correct path, the code below does exactly what I asked for.

<?php

// Note you must download the php files from the link above 
// and link to them on this line below. 
include_once('../simple_html_dom.php');

$html = file_get_html('http://www.lifesitenews.com/news/second-madagascar-archbishop-criticizes-catholic-relief-services-full-trans');
$elem = $html->find('div[id=article-body]', 0);
echo $elem;

?>
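
For comparison, here is a minimal sketch of the same extraction using only PHP's built-in DOM extension. The warning quoted in the question ("ID changeRegionForm already defined") points at a duplicate id in the fetched page's markup, and `libxml_use_internal_errors(true)` keeps such parse warnings from being emitted. The function name `extractElementById` is made up for illustration, and the lookup uses an XPath attribute query rather than `getElementById`. Treat it as one possible approach, not a drop-in replacement for the library above.

```php
<?php
// Hypothetical helper: return the outer HTML of the first element
// whose id attribute equals $id, or null when no such element exists.
function extractElementById(string $html, string $id): ?string
{
    // Collect libxml parse warnings (for example "ID ... already defined")
    // internally instead of emitting them as PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Look the element up by attribute value with XPath.
    // Assumption: $id contains no double quotes.
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//*[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // Passing a node to saveHTML() serializes just that node.
    return $doc->saveHTML($nodes->item(0));
}
```

Usage would look something like `echo extractElementById(file_get_contents($url), 'article-body');`, keeping in mind that `file_get_contents()` returns `false` on failure, so a real script should check for that first.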
Papa De Beau
1

This is wrong:

 <div> id="article-body" </div>

should be

 <div id="article-body"></div>

I guess you want to use jQuery.ajax to load content from another source.

Since you want a specific part of the data inside some external URL (the contents of DIV#article-body), you will have to parse the full external content to get just what you need.

I would try jQuery's parseHTML.

LexLythius
  • Sorry that was an error as I wrote the question. Fixed this. Thanks – Papa De Beau Sep 09 '13 at 20:12
  • And with this I can get a specific div? Ok I will try. Thanks – Papa De Beau Sep 09 '13 at 20:27
  • Note that this is the client-side approach, and you will have to watch out for issues with the [same-origin policy](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Same_origin_policy_for_JavaScript), which you may be able to circumvent by using JSONP (in jQuery, dataType: "jsonp"). – LexLythius Sep 09 '13 at 20:33
  • The other alternative is to use your own server as a reverse proxy, fetching these contents using [cURL](php.net/manual/en/book.curl.php). But this is much heavier on your server.
  • Thanks LexLythius, not sure what your comment about same-origin policy means.... but I will look into curl. I don't think the jQuery will work but I have not tried it yet. – Papa De Beau Sep 09 '13 at 20:36
  • For security reasons, browsers (and sometimes the sites themselves) impose restrictions on what you can access via AJAX requests. Read [this](en.wikipedia.org/wiki/Same-origin_policy). Server-side is not subject to these issues, but having your server make a remote request (using cURL) for each client request is probably a handicap you need to consider. – LexLythius Sep 09 '13 at 20:45
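
The server-side route described in the comments above could be sketched roughly as follows: fetch the page with cURL, and keep a short-lived copy on disk so that not every client request triggers a remote request. Both function names (`fetchWithCurl`, `fetchCached`), the timeout values, and the cache file naming are assumptions made for illustration, not anything prescribed in the thread.

```php
<?php
// Hypothetical helper: download a page with cURL and return its body.
function fetchWithCurl(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => 5,    // seconds (assumed value)
        CURLOPT_TIMEOUT        => 15,   // seconds (assumed value)
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}

// Hypothetical helper: reuse a cached copy while it is younger than
// $ttl seconds, otherwise call $fetcher and store the fresh result.
function fetchCached(string $url, int $ttl, callable $fetcher, ?string $cacheDir = null): string
{
    $cacheDir  = $cacheDir ?? sys_get_temp_dir();
    $cacheFile = $cacheDir . '/page_' . md5($url) . '.html';

    // Make sure file checks below see the current state on disk.
    clearstatcache(true, $cacheFile);
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $ttl) {
        return file_get_contents($cacheFile);
    }

    $body = $fetcher($url);
    file_put_contents($cacheFile, $body, LOCK_EX);
    return $body;
}
```

A call such as `fetchCached($url, 300, 'fetchWithCurl')` would then hit the remote site at most once every five minutes per URL; the returned HTML still has to be parsed to pull out DIV#article-body.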