
Here is my problem:
When I fetch some sites with PHP cURL, or display the source code of the page, a lot of elements are missing. I think part of the page is loaded by a script or something similar.
Could someone help me get the entire code with PHP cURL?


To reproduce my problem, go to Facebook or LinkedIn, right-click on the page and choose "View the source code of the page". There you don't see all of the page content, but when you right-click and choose "Inspect an element", you do. Thank you in advance.

  • As I understand it (which may turn out to be "not at all"...), your browser downloads the initial page code, which is the same as you receive using CURL. It then executes the code inside that page, which populates the other page elements, so I believe you'll have to simulate that. Does the site you are trying to access not offer an API to access the information you need? – droopsnoot Nov 08 '21 at 09:30
  • Thank you for your response. Yes, there is an API; I'll try to access it. – Adam Garchi Nov 08 '21 at 09:42

1 Answer


CURL can't do this. It's not designed to render HTML or execute JavaScript.

A lot of the content on Facebook, LinkedIn, Twitter and many other sites is not part of the initial response; it is loaded afterwards by other means (such as fetch() requests or WebSocket events).

Some nodes you can see in the inspector are not part of the original document (which is what "view source" shows and what curl downloads). What you see in the inspector is everything currently held in memory, part (or all) of which was created by a scripting language.
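
To make the difference concrete, here is a minimal offline sketch. It is written in Python rather than PHP purely for illustration, and everything in it is an assumption: `RAW_HTML` is a made-up stand-in for what a server might send, and `TagCollector` and `static_tags` are hypothetical helper names. It reads the markup the way a plain HTTP client receives it, as text, without executing the script inside it.

```python
from html.parser import HTMLParser

# Hypothetical response body: an empty container plus a script that
# would fill it in once a browser executes it.
RAW_HTML = """
<html>
  <body>
    <div id="feed"></div>
    <script>
      document.getElementById("feed").innerHTML =
        "<article class='post'>Hello</article>";
    </script>
  </body>
</html>
"""


class TagCollector(HTMLParser):
    """Records the start tags present in the markup; scripts are not run."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def static_tags(html):
    """Return the element names found in the raw, unrendered document."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


tags = static_tags(RAW_HTML)
print(tags)
```

The `article` element never shows up in the list, because it only exists as a string inside the script until a JavaScript engine runs that script. The same holds for the body returned by PHP's `curl_exec()`: it is the raw document, not the rendered one.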

This is basically done to

  • reduce the load on servers, as they don't have to generate the whole page on every request
  • reduce traffic on clients and servers (no need to reload the header-data and/or scripts over and over again)

If you need data from a rendered site, either check whether the website provides an API that returns the data you are looking for, or use one of the CLI rendering engines from this answer.

Christopher