3

As the title says, my question is how to output all of the DOM content on a page, for example by saving it as a text file on the server or by passing the result to a PHP function via Ajax.

I did some homework. curl can output the HTML the server sends, using "curl http://google.ca > dom.txt". However, this approach does not save content generated by JavaScript, because curl never runs the page's scripts. Another approach is to embed some JavaScript in a page, let that page load the website we want to capture, and then use the JavaScript to save the whole DOM after everything has loaded.

I am not sure whether phantom.js can do this job. If it can, how?

Can anybody give a detailed answer on how to achieve this?

I am open to any solution. The program will run on my server to provide a service.

Thank you in advance.

cache
  • 4
    Why don't you think phantom.js can? I think that's exactly what it's for. – Explosion Pills May 25 '12 at 00:15
  • @iambriansreed: Yes, but do you really think "jQuery" is capable of that task? – Bergi May 25 '12 at 00:16
  • @Explosion Pills: I have only used phantom.js from the command line to export a HAR file. Could you please give me some detailed guidance on how to achieve this? Much appreciated! – cache May 25 '12 at 00:17

3 Answers

4

Why not:

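jQuery(document).ready(function($) {
    // Pass the data as an object so jQuery URL-encodes the markup;
    // a hand-built 'html=' + ... string breaks on "&", "+" and "%".
    $.post(
        '/your_filename.php',
        { html: $("html").html() },
        function(response){
            alert(response);
        }
    );
});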
iambriansreed
1

You can get the contents of the HTML element (including both head and body) using document.documentElement.innerHTML. If you need everything, you can prepend the doctype to document.documentElement.outerHTML. Note that document.doctype is a node rather than a string, so it has to be serialized first.

Note that outerHTML isn't quite cross-browser (it works in IE and Chrome, but not Firefox) - for a way to simulate outerHTML for Firefox, see this question: How do I do OuterHTML in firefox?
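As a rough sketch of the doctype-plus-outerHTML idea above: the doctype node exposes name, publicId and systemId, so it can be turned back into a string by hand. The helper names doctypeToString and serializePage are made up for this illustration, and the code assumes outerHTML is available in the browser being used.

```javascript
// Hypothetical helper: rebuild the <!DOCTYPE ...> declaration from a
// DocumentType-like object (anything with name, publicId and systemId).
function doctypeToString(doctype) {
    if (!doctype) {
        return '';
    }
    var out = '<!DOCTYPE ' + doctype.name;
    if (doctype.publicId) {
        out += ' PUBLIC "' + doctype.publicId + '"';
    } else if (doctype.systemId) {
        out += ' SYSTEM';
    }
    if (doctype.systemId) {
        out += ' "' + doctype.systemId + '"';
    }
    return out + '>';
}

// Hypothetical helper: doctype (if any) followed by the <html> element's markup.
function serializePage(doc) {
    var doctype = doctypeToString(doc.doctype);
    return (doctype ? doctype + '\n' : '') + doc.documentElement.outerHTML;
}
```

In a browser this would be called as serializePage(document), and the resulting string could be posted to the server in place of $("html").html().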

Community
Brilliand
  • Thank you for your answer. Then how can you output this internal javascript object? – cache May 25 '12 at 00:25
  • Probably just go with a jQuery post, like iambriansreed suggested. My answer is a drop-in replacement for `$("html").html()` - particularly if you need the doctype and `<html>` tag. – Brilliand May 25 '12 at 00:30
  • 1
    I'm not familiar with phantom.js, but after searching its documentation, you might be able to accomplish the same thing via the `WebPage.content` property. – Brilliand May 25 '12 at 00:44
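For what it's worth, here is a minimal sketch of the phantom.js route mentioned in the comment above. It loads a URL, waits briefly so the page's scripts can run, and writes the page's content property to a file. The script name save_dom.js, the helper defaultOutputName and the fixed 2-second wait are assumptions made for this illustration; a real page may need a longer wait or a smarter "is it done yet" check.

```javascript
// save_dom.js: intended to be run as `phantomjs save_dom.js <url> [output-file]`.

// Hypothetical helper: derive an output file name from the URL's host part.
function defaultOutputName(url) {
    var host = url.replace(/^[a-z]+:\/\//i, '').split(/[\/?#]/)[0];
    return (host.replace(/[^a-z0-9.-]/gi, '_') || 'page') + '.html';
}

// Only run the PhantomJS-specific part when the `phantom` global exists.
if (typeof phantom !== 'undefined') {
    var system = require('system');
    var fs = require('fs');
    var page = require('webpage').create();
    var url = system.args[1];

    if (!url) {
        console.log('Usage: phantomjs save_dom.js <url> [output-file]');
        phantom.exit(1);
    } else {
        var output = system.args[2] || defaultOutputName(url);
        page.open(url, function (status) {
            if (status !== 'success') {
                console.log('Failed to load ' + url);
                phantom.exit(1);
                return;
            }
            // Arbitrary delay (an assumption) to let scripts modify the DOM.
            setTimeout(function () {
                // page.content is the current markup of the loaded page.
                fs.write(output, page.content, 'w');
                phantom.exit(0);
            }, 2000);
        });
    }
}
```

Run this way, everything stays on the server: no browser in an X session is needed, and the saved file reflects the DOM after the page's JavaScript has had a chance to run.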
0

JavaScript is a client-side language, so running it on a server is going to require specialized technology. PHP actually has the ability to work with the DOM: it can build and modify DOM elements before transmitting them to the client. Read more about that here.

I'm not really sure what you are trying to accomplish by doing this, but it sounds like you are trying too hard: you are sending code to the client so that the client can turn around and send code back to the server so that the server can save it as a file? If that really is what you need to do, follow Brilliand's and iambriansreed's advice and scoop up the DOM elements with JavaScript/jQuery.

hypervisor666
  • Thanks for your answer. Everything actually happens on the server, even sending code to the "client"; this client could simply be a browser running the code in an X session. – cache May 25 '12 at 00:29
  • 1
    (phantom.js was *designed* to run "headless" or "on a server") –  May 25 '12 at 00:29
  • @pst I knew that running javascript on a server was possible (from reading about a programming language project called "Dart" --http://www.dartlang.org/), but never really saw the sense in using it because there are already quite a few robust server side languages, so I never really looked into it. – hypervisor666 May 25 '12 at 00:34