0

I send a request to a website to get its content with file_get_contents, e.g.:

$html=file_get_contents('http://....');
var_dump(HTML::encode($html));

But the page's body tag is filled in by JS, so I can't get the body. The body I receive looks like this:

<body> </body>

How can I get the body with PHP?

Mahdi Bahari
  • You could use [SimpleXML](http://php.net/manual/en/simplexml.examples-basic.php) or (please don't do this next one) a string-matching solution, like a regex, or explode() on the opening body tag followed by explode() on the closing body tag. – scagood Mar 06 '18 at 09:26
  • `HTML::encode` only converts characters, e.g. `©` goes to `&copy;`. Have you tried anything else? – scagood Mar 06 '18 at 09:29
  • Possible duplicate of [Can simplexml be used to rifle through html?](https://stackoverflow.com/questions/6635849/can-simplexml-be-used-to-rifle-through-html) – scagood Mar 06 '18 at 09:29
  • That doesn't return any body, because the body is loaded by JS. When PHP sends the request, the response comes back without any of the JS having run. – Mahdi Bahari Mar 06 '18 at 09:31
  • https://stackoverflow.com/a/1770607/3533202 – scagood Mar 06 '18 at 09:31
  • If the main content is loaded asynchronously via JS, you'll have to parse and execute that JS in order to get all the page contents first. – feeela Mar 06 '18 at 09:31
  • Possible duplicate of [Non-browser emulation of JavaScript - is it possible?](https://stackoverflow.com/questions/1768717/non-browser-emulation-of-javascript-is-it-possible) – feeela Mar 06 '18 at 09:32
  • Possible duplicate of [PHP: how can I load the content of a web page into a variable?](https://stackoverflow.com/questions/3249157/php-how-can-i-load-the-content-of-a-web-page-into-a-variable) – Tobok Sitanggang Mar 06 '18 at 09:34

2 Answers

2

You can use tools specifically designed for this purpose.

A popular solution is Symfony's Panther library.

Suppose the page you want to fetch is hosted at http://example.com, and an element with the id "myElement" is added to the page by JavaScript (its presence indicates that the JavaScript we depend on has finished executing). We could then run the following code:

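// Start a real Chrome instance controlled by Panther.
$client = \Symfony\Component\Panther\Client::createChromeClient();
$crawler = $client->request('GET', 'http://example.com');
// Block until the JavaScript-inserted element is present in the DOM.
$client->waitFor('#myElement');
// Dump the rendered markup, including the content added by JavaScript.
var_dump($crawler->html());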
Jonathan
-2

If the target website's content is populated by a script, you cannot access it with the method above: a plain PHP request has no environment in which to execute that script and fill the body. Alternatively, you could fetch the target site's content with AJAX, but that is subject to origin/request restrictions and is only possible if you have access to the target website. You could also use an iframe. I don't know which of these is suitable, though; what do you really need to accomplish?
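To illustrate why the plain request comes back empty, here is a minimal sketch. The markup in `$html` is a made-up stand-in for what file_get_contents would return from such a page (the script and the "myElement" id are assumptions for illustration). PHP's DOMDocument only parses the markup; it never runs the script, so the body stays blank.

```php
<?php
// Hypothetical response body: the visible content is created by JavaScript,
// so the markup delivered over HTTP contains an empty <body>.
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<head>
<script>
document.addEventListener('DOMContentLoaded', function () {
    var p = document.createElement('p');
    p.id = 'myElement';
    p.textContent = 'Loaded by JS';
    document.body.appendChild(p);
});
</script>
</head>
<body> </body>
</html>
HTML;

// Parse the markup; libxml warnings about HTML5 syntax are collected, not printed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);

// The script was parsed as text but never executed.
$body = $doc->getElementsByTagName('body')->item(0);
$bodyText = trim($body->textContent);

// Look for the element the script would have created in a browser.
$xpath = new DOMXPath($doc);
$hasElement = $xpath->query('//*[@id="myElement"]')->length > 0;

var_dump($bodyText);
var_dump($hasElement);
```

Getting the populated body requires something that actually executes the page's JavaScript, such as a real or headless browser.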

fayis003
  • I can't use AJAX or an iframe because I'm writing a crawler. There is no API, so I'm forced to do it this way, e.g. log in on a page and fetch the data with PHP. But the JS that fills the page runs in the browser, while my requests come from PHP or another language such as Python. – Mahdi Bahari Mar 06 '18 at 11:19
  • For that, you could use something like Node.js to load the website and get the final HTML after the scripts have populated it. – fayis003 Mar 06 '18 at 11:27