
How can I get the title of an HTML page using PHP? I've written a PHP web crawler and want to add this feature so that it stores the name of each page along with its URL. Possibly using preg_match. Thanks in advance.

  • Well, how does your crawler work? – BoltClock Feb 06 '11 at 16:34
  • It parses the links and follows each one, but that is not what I'm asking about. I want to parse the HTML page and find its title. –  Feb 06 '11 at 16:36
  • Possible duplicate of [Best methods to parse HTML](http://stackoverflow.com/questions/3577641/best-methods-to-parse-html) – Gordon Feb 06 '11 at 16:39
  • You should search for [scraping PHP](http://stackoverflow.com/search?q=scraping+php); more than enough information is already available. I picked one with [a lot of votes](http://stackoverflow.com/questions/34120/html-scraping-in-php). – Alfred Feb 06 '11 at 16:42
  • Example of how to do it with DOM: [crawling a html page using php?](http://stackoverflow.com/questions/3946506/crawling-a-html-page-using-php/3955436#3955436) – Gordon Feb 06 '11 at 16:42

1 Answer


Would this help?

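$myURL = 'http://www.google.com';
// file_get_contents() returns false if the page cannot be fetched.
$html = file_get_contents($myURL);
// i: match <TITLE> as well; s: let the title span newlines; [^>]*: allow attributes.
// The non-greedy (.*?) stops at the first closing </title>.
if ($html !== false
    && preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $matches)) {
    $title = trim($matches[1]);
} else {
    $title = "Not Found";
}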
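
A regex can still trip over unusual markup, so here is a minimal sketch of the DOM-based approach mentioned in the comments. It is not part of the original answer: the function name `extractTitle` and the sample HTML are made up for illustration, and it assumes the `dom` extension (`DOMDocument`) is available.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text of the
// first <title> element in an HTML string, or null if there is none.
function extractTitle(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();

    // Real-world pages are often malformed; collect libxml parse errors
    // internally instead of emitting a warning for each one.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = $doc->getElementsByTagName('title');
    if ($nodes->length === 0) {
        return null;
    }

    $title = trim($nodes->item(0)->textContent);
    return $title === '' ? null : $title;
}

// Sample input: upper-case tag, an attribute, and a title spread over several lines.
$html = "<html><head><TITLE lang=\"en\">\n  Example Page\n</TITLE></head><body></body></html>";
echo extractTitle($html) ?? 'Not Found', PHP_EOL;
```

In a crawler you would pass in the string returned by `file_get_contents($myURL)` after checking that it is not `false`. Note that `loadHTML()` may misread non-ASCII titles when the page does not declare its character set, so treat this as a starting point rather than a complete solution.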
greg0ire
Andreas
  • This does not work for some URLs, for example: http://www.marketwired.com/press-release/advanced-cannabis-solutions-announces-definitive-agreement-to-acquire-property-otcqb-cann-1866064.htm – ARUN Mar 29 '14 at 10:29
  • This can help: http://stackoverflow.com/questions/399332/fastest-way-to-retrieve-a-title-in-php – shasi kanth Apr 07 '15 at 10:26