How can I get the title of an HTML page using PHP? I've made a PHP web crawler and I want to add this feature so that it records the name of each page along with its URL. Possibly using preg_match. Thanks in advance.
-
Well, how does your crawler work? – BoltClock Feb 06 '11 at 16:34
-
It parses the links and follows each one, but that is not what I'm asking about. I want to parse the HTML page and extract the title of the page. – Feb 06 '11 at 16:36
-
Possible duplicate of [Best methods to parse HTML](http://stackoverflow.com/questions/3577641/best-methods-to-parse-html) – Gordon Feb 06 '11 at 16:39
-
You should search for [scraping PHP](http://stackoverflow.com/search?q=scraping+php); there is more than enough information already available, but I picked one with [a lot of votes](http://stackoverflow.com/questions/34120/html-scraping-in-php). – Alfred Feb 06 '11 at 16:42
-
Example of how to do it with DOM: [crawling a html page using php?](http://stackoverflow.com/questions/3946506/crawling-a-html-page-using-php/3955436#3955436) – Gordon Feb 06 '11 at 16:42
1 Answer
Would this help?
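$myURL = 'http://www.google.com';
// Fetch the page; file_get_contents() returns false on failure.
$html = @file_get_contents($myURL);
// Match the first <title> element: case-insensitive (i), spanning
// newlines (s), non-greedy, and tolerant of attributes on the tag.
if ($html !== false
    && preg_match('/<title\b[^>]*>(.*?)<\/title>/is', $html, $matches)) {
    $title = trim($matches[1]);
} else {
    $title = "Not Found";
}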
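As the comments on the question suggest, a DOM parser is usually more forgiving than a regular expression when the markup is irregular. The following is only a minimal sketch using PHP's bundled DOM extension, not a drop-in replacement for the snippet above; the helper name `extractTitle` and the sample HTML string are made up for illustration.

```php
<?php
// Hypothetical helper (not part of the original answer): return the text of
// the first <title> element in an HTML string, or null if there is none.
function extractTitle(string $html): ?string
{
    if (trim($html) === '') {
        return null; // loadHTML() does not accept empty input
    }

    $doc = new DOMDocument();
    // Real-world markup is often invalid; collect parser warnings silently
    // instead of letting them be emitted, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = $doc->getElementsByTagName('title');
    if ($nodes->length === 0) {
        return null;
    }

    $title = trim($nodes->item(0)->textContent);
    return $title === '' ? null : $title;
}

// Sample input (an assumption for illustration): upper-case tag, an
// attribute, and line breaks inside the title.
$html = "<html><head><TITLE lang=\"en\">\n  Example Page\n</TITLE></head><body></body></html>";
$title = extractTitle($html) ?? 'Not Found';
echo $title, "\n"; // prints "Example Page"
```

In a crawler, the string passed in would be the page body already fetched (for example the result of file_get_contents($myURL), after checking it is not false). Note that loadHTML() may mis-decode non-ASCII titles when the page does not declare its character set, so the encoding may need separate handling.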
-
This does not work for some URLs, for example: http://www.marketwired.com/press-release/advanced-cannabis-solutions-announces-definitive-agreement-to-acquire-property-otcqb-cann-1866064.htm – ARUN Mar 29 '14 at 10:29
-
This can help: http://stackoverflow.com/questions/399332/fastest-way-to-retrieve-a-title-in-php – shasi kanth Apr 07 '15 at 10:26