2

From my understanding, there are two types of source code: Generated Source and Source (as described in What is the difference between "Source" and "Generated Source"?).

When I use the PHP Simple HTML DOM Parser (http://simplehtmldom.sourceforge.net/), I notice that I can only get the Source Code.

How do I get the Generated Source code?

If it is not possible using the PHP Simple HTML DOM Parser, are there other ways using PHP to get the Generated Source code? (Optional)

If it is not possible using PHP to get the Generated Source code, are there other ways using JavaScript to get it? (Optional)

Update 1: With reference to the answer by user Shankar Damodaran, I need to revise my understanding. There are three types of source code, as follows:

  • Actual Source Code (e.g. PHP, ASPX. Usually applies to server-side scripts)

  • Source Code (the source code before JavaScript and CSS are applied)

  • Generated Source Code (the source code after JavaScript and CSS are applied)

Community
user275517

2 Answers

2

You can't via PHP alone. You have to rely on a tool such as Selenium (which drives a real browser) or PhantomJS (a headless browser); either will render the page and return the HTML structure you are looking for.
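
As a rough sketch of what "generated source" means in practice: inside a real or headless browser, the rendered markup can be read back from the live DOM once the scripts have run. The helper name `getGeneratedSource` is made up for illustration, and the doctype handling assumes a plain HTML5 doctype; `documentElement.outerHTML` and `doctype.name` are standard DOM properties.

```javascript
// Illustrative helper (hypothetical name): serialize the current state of the
// DOM, i.e. the "generated source" after JavaScript has modified the page.
// `doc` is assumed to be a browser Document object.
function getGeneratedSource(doc) {
  // Simplification: only the doctype name is emitted (fine for <!DOCTYPE html>);
  // legacy public/system identifiers are ignored.
  const doctype = doc.doctype ? "<!DOCTYPE " + doc.doctype.name + ">\n" : "";
  // outerHTML reflects the live DOM, not the HTML originally sent by the server.
  return doctype + doc.documentElement.outerHTML;
}
```

In a browser console this would be called as `getGeneratedSource(document)`. A headless browser would need to evaluate the same expression inside the loaded page and hand the resulting string back to the PHP side.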

moonwave99
  • You can also find a few more tools here: http://stackoverflow.com/a/125256/1189040 – Himal Apr 14 '14 at 11:17
0

I think you misunderstood.

Source Code is interpreted by the server.
Generated Source Code is what is returned to the browser (just the HTML).

The HTML DOM parser works with the Generated Source Code, not with the actual Source Code.

To answer your questions...

How do I get the Generated Source code?

You cannot get the Actual Source Code, except by illegal means.

Community
Shankar Narayana Damodaran
  • I think he is referring to the HTML source, not the server-side code. Also, by generated source code he meant the page after running JavaScript etc. – Himal Apr 14 '14 at 11:08
  • Yeah, the distinction is the HTML response vs. the DOM after the client-side app has initialised. – moonwave99 Apr 14 '14 at 11:08