10

Possible Duplicates:
Get title of website via link
How do I extract title of a website?

How can a website's title be grabbed using PHP DOM? (Which is the best way to grab it using PHP?)

Community
john

2 Answers

18

You can use getElementsByTagName(). A valid HTML document has only one title element, so you can simply take the first match you come across in the DOM.

// $urlpage holds the URL (or local path) of the page to load.
$title = '';
$dom = new DOMDocument();

if($dom->loadHTMLFile($urlpage)) {
    $list = $dom->getElementsByTagName("title");
    if ($list->length > 0) {
        $title = $list->item(0)->textContent;
    }
}
John Cartwright
7

This version suppresses any parsing warnings caused by malformed HTML or missing elements:

<?php

$doc = new DOMDocument();
// The @ operators silence both fetch and parser warnings.
@$doc->loadHTML(@file_get_contents("http://www.washingtonpost.com"));

// find the title
$titlelist = $doc->getElementsByTagName("title");
if ($titlelist->length > 0) {
    echo $titlelist->item(0)->nodeValue;
}
Femi
  • `loadHTMLFile` already incorporates file_get_contents and does not give errors on malformed HTML so any errors it does produce would be valuable. `loadHTML` also does not give errors on malformed HTML. – Erik May 03 '11 at 13:23
  • Well, when I use `$doc->loadHTMLFile("http://www.washingtonpost.com");` right now I get a bunch of errors that say *Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: htmlParseEntityRef: expecting ';' in http://www.washingtonpost.com, line: 52 in /var/www/test/test2.php on line 5*. Maybe it's my PHP version, but... – Femi May 03 '11 at 13:36
  • 1
    You're right, my apologies. I was remembering that it will parse it, but it does indeed show warnings. However, the @ suppression method is still a bad choice. You'd be better off setting `libxml_use_internal_errors(true);` so you could access the error data if you wanted or needed to. – Erik May 03 '11 at 13:42
  • Point: this was quick and dirty. It IS a little lazy using the error suppression, and I wasn't aware it was using libxml under the hood. – Femi May 03 '11 at 13:44
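
For anyone landing here later, the `libxml_use_internal_errors(true)` approach suggested in the comments could look roughly like the sketch below. The function name `extractTitle` and the sample markup are made up for illustration and are not from either answer. It parses a string with `loadHTML` so it runs without network access; the same pattern should apply to `loadHTMLFile`.

```php
<?php
// Hypothetical helper (not from the original answers): returns the text of
// the first <title> element in an HTML string, or null if there is none.
function extractTitle(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() does not accept an empty string
    }

    // Collect libxml parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);

    // libxml_get_errors() could be called here to inspect or log the
    // collected errors before they are cleared.
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the earlier setting

    if (!$loaded) {
        return null;
    }

    $titles = $doc->getElementsByTagName('title');
    return $titles->length > 0 ? trim($titles->item(0)->textContent) : null;
}

// Made-up sample markup; the bare "&" is the kind of input that normally
// triggers an htmlParseEntityRef warning.
$html = '<html><head><title>Fish & Chips</title></head><body><p>Hi</body></html>';
echo extractTitle($html), "\n";
```

Unlike the `@` operator, this keeps the parser messages around, so they can be logged or ignored deliberately, and unrelated errors on the same line are not hidden.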