I am trying to extract the title text from an HTML page and assign it to a property of an object. I am using Symfony and PHP. The result of filterXPath does not seem to be plain text; it appears to be the entire HTML page, and the assignment throws an error. I don't know why.

My code is:

$html =  $this->file_get_contents_curl("http://www.google.com/");
$urlData = [];
$crawler = new Crawler($html);
$urlData->title = $crawler->filterXPath('//title')->extract('_text');

I see the title text if I do:

return $crawler->filterXPath('//title')->extract('_text');
voytek
Sanjoy

1 Answer

Try this:

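libxml_use_internal_errors(true); // collect HTML parse warnings instead of printing them
$html = file_get_contents("http://www.google.com/");
$dom1 = new DOMDocument;
$dom1->preserveWhiteSpace = false;
$dom1->loadHTML($html);
$xp = new DOMXPath($dom1);
// Optional: only relevant when XPath expressions call PHP functions.
$xp->registerNamespace("php", "http://php.net/xpath");
$urlData = $xp->query('//title'); // DOMNodeList of matching <title> elements
foreach ($urlData as $title) {
    echo $title->textContent; // plain-text content of the node
}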
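
If the goal is to keep the title as a plain string on an object rather than echo it, here is a minimal sketch. It assumes the error in the question comes from two things: $urlData = [] creates an array, which cannot take a ->title property, and extract() returns an array of values rather than a single string. The inline $html sample and the $nodes variable are placeholders introduced for this illustration and are not part of the original code.

```php
<?php
// Inline sample markup stands in for the fetched page (assumption for this sketch).
$html = '<html><head><title>Example Title</title></head><body><p>Hi</p></body></html>';

libxml_use_internal_errors(true); // keep parse warnings from real-world HTML quiet
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xp = new DOMXPath($dom);
$nodes = $xp->query('//title'); // DOMNodeList of <title> elements

// An array ([]) has no properties, so hold the result in an object instead.
$urlData = new stdClass();

// Store the text of the first <title> node as a string, or null if none was found.
$urlData->title = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;

echo $urlData->title, PHP_EOL;
```

With Symfony's Crawler, calling text() on the filtered node instead of extract() should likewise give a single string, provided the page actually contains a title element.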
Learning
  • I need to assign $title (which should be a plain-text string) to $data->title. When I try, I get the same error. I'd appreciate any help. – Sanjoy Jul 29 '15 at 14:18