After some Google searching, I did not find anything that fills my need. I want to save the current web page exactly as it is. Many web pages run JavaScript and change their CSS, so after some user interaction the page may differ from the one that was first loaded into the browser. I want to send the current state of the web page to the server and render it on the server. Is there any JavaScript library for this task? Thanks!
4 Answers
Even simpler:
var serialized = document.documentElement.innerHTML
outerHTML instead of innerHTML would be better, but it doesn't work in Firefox.
Let's test it:
>>> document.body.style.color = 'red';
>>> document.documentElement.innerHTML
...
<body style="color: red;">
...
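Note that innerHTML on the root element leaves out the <html> tag itself and the doctype. A minimal sketch of a fuller serialization, assuming a browser whose elements expose outerHTML (current versions of Firefox do); `serializeDocument` is a hypothetical helper name, not part of any library:

```javascript
// Serialize a document, including its doctype, as a single HTML string.
// `doc` is expected to look like a browser Document object.
function serializeDocument(doc) {
  var doctype = '';
  if (doc.doctype) {
    doctype = '<!DOCTYPE ' + doc.doctype.name;
    if (doc.doctype.publicId) {
      doctype += ' PUBLIC "' + doc.doctype.publicId + '"';
    }
    if (doc.doctype.systemId) {
      // SYSTEM is only written when there is no PUBLIC identifier.
      doctype += (doc.doctype.publicId ? '' : ' SYSTEM') +
        ' "' + doc.doctype.systemId + '"';
    }
    doctype += '>\n';
  }
  // outerHTML includes the <html> tag and its attributes.
  return doctype + doc.documentElement.outerHTML;
}
```

In a browser this would be called as `var serialized = serializeDocument(document);`.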

-
I think another problem with this solution is that when some elements' styles are changed by JavaScript, inner/outerHTML can't reflect those changes... – Yang Bo Jan 09 '10 at 08:40
-
I ran 'document.body.color='red';document.documentElement.innerHTML;' in the Chrome console, and I got ''. Did you use Firefox or IE? – Yang Bo Jan 09 '10 at 09:13
-
`document.body.style.color`, not `document.body.color`. My bad. – NVI Jan 09 '10 at 10:06
-
textarea.value cannot be serialized this way – CS QGB Mar 06 '20 at 07:16
I'm working on something rather similar and wanted to share a summary of what I'm noticing with innerHTML in IE8, Firefox 3.6, and Chrome 5.0:
IE
- Strips the quotes from around many of the element attributes
- Singleton nodes aren't self closed
- If the values on the elements change after the HTML has been loaded, it picks up the new values
FF, CHROME
- Singleton nodes aren't self closed
- If the values on the elements change after the HTML has been loaded, it does NOT pick up the new values. It only picks up the default values set in the HTML upon initial rendering.
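One way to work around the Firefox/Chrome behaviour is to copy the live form state into the markup before reading innerHTML. The following is a rough sketch under that assumption; `syncFormState` is a hypothetical helper name, and it would also write password fields into the markup, so those may need to be skipped:

```javascript
// Copy live form state into attributes/content so that
// innerHTML/outerHTML reflects what the user currently sees.
function syncFormState(root) {
  root.querySelectorAll('input, textarea, select').forEach(function (el) {
    var tag = el.tagName.toLowerCase();
    if (tag === 'textarea') {
      // A textarea serializes its text content, not a value attribute.
      el.textContent = el.value;
    } else if (tag === 'select') {
      Array.prototype.forEach.call(el.options, function (opt) {
        if (opt.selected) opt.setAttribute('selected', '');
        else opt.removeAttribute('selected');
      });
    } else if (el.type === 'checkbox' || el.type === 'radio') {
      if (el.checked) el.setAttribute('checked', '');
      else el.removeAttribute('checked');
    } else {
      el.setAttribute('value', el.value);
    }
  });
}
```

In a browser, calling `syncFormState(document)` right before `document.documentElement.innerHTML` should make typed text, checked boxes, and selected options show up in the serialized string.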

Serializing a complete web page is as simple as:
var serialized = document.body.innerHTML;
If you really need the full document, including the head, then:
var serialized =
    '<head>' +
    document.getElementsByTagName('head')[0].innerHTML +
    '</head><body>' +
    document.body.innerHTML +
    '</body>';
Now all you need to do is submit it via AJAX.
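As an illustration of the submit step, here is a minimal sketch that assumes a current browser with `fetch` available. The helper name `sendSnapshot` and the `/snapshots` endpoint are hypothetical; the optional third argument only exists so the function can be exercised without a network:

```javascript
// POST a serialized page to the server as text/html.
// `fetchImpl` is optional and defaults to the global fetch.
function sendSnapshot(html, url, fetchImpl) {
  var doFetch = fetchImpl || fetch;
  return doFetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'text/html; charset=utf-8' },
    body: html
  });
}
```

Usage in a browser might look like `sendSnapshot(serialized, '/snapshots')`, with the server saving the request body to a file.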
As for server-side rendering, it depends on what you mean by rendering. I'm currently using wkhtmltopdf to implement a 'save as PDF' feature on my site. It uses WebKit to render the HTML before generating the PDF, so it fully supports CSS and JavaScript.
And if you need to save it as an image instead of a PDF file, you can always use Ghostscript to print the PDF to a JPG/PNG file.

-
Ah, but does element.innerHTML contain the style information of the element? – Yang Bo Jan 09 '10 at 06:58
-
But what if, after some user interaction, some elements' styles have been changed by JavaScript? This is really annoying... – Yang Bo Jan 09 '10 at 08:33
-
Then just do the innerHTML **after** the user interactivity. innerHTML is a sort of reference to the browser's HTML compiler/parser. It is not the same as view source. – slebetman Jan 09 '10 at 10:20
-
wkhtmltopdf can convert any web page to a PDF file. Is there any similar tool which can convert a web page into a PNG/JPG image? – Yang Bo Jan 10 '10 at 03:20
-
Not that I'm aware of. However you can use ghostscript to 'print' the PDF file to JPG: `gs -sDEVICE=jpeg -o out-image.jpg webpage.pdf` – slebetman Jan 10 '10 at 03:43
-
@slebetman, suppose we have saved the HTML of the page for the user. How can we load it back? Can you post a script for that? – mostafa khansa Sep 23 '13 at 08:13
-
What do you mean by "load it back"? A serialized web page is just a string. The OP wants to send the page "as seen by the user" (not the original page source) back to his server, so the string can simply be sent back using AJAX. On the server that string (the HTML page) is simply saved to a file. To "load it back", simply open that file in a web browser. – slebetman Sep 23 '13 at 08:35
-
To "load back" a page to the USER instead of the developer/admin/hacker/spy you should not use this technique as the page is effectively "dead" - the javascript that runs on the page will/may be confused because the page may have a DOM format that is not expected and all event handlers added via javascript will have been lost. Instead, serialize the relevant DATA on the page using a form or simply send back the relevant javascript objects as JSON and have the data reloaded when the user revisits the page. – slebetman Sep 23 '13 at 08:39
-
This post mentions a serialiseWithStyles() function. The function calculates the styles for each element and prints them inline, which eliminates the need for separate stylesheets.
Then, to submit the result to a server, send a POST request, using either AJAX or a plain form.
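The general idea (this is not the code from the linked post) can be sketched roughly as follows. The name `serializeWithInlineStyles` is hypothetical, `win` is assumed to be the browser `window`, and pseudo-elements and states such as :hover are not captured:

```javascript
// Clone `root`, write each element's computed style into the clone's
// style attribute, and return the clone's markup.
function serializeWithInlineStyles(root, win) {
  var clone = root.cloneNode(true);
  var toArray = Array.prototype.slice;
  // Live elements and their copies appear in the same document order.
  var live = [root].concat(toArray.call(root.querySelectorAll('*')));
  var copies = [clone].concat(toArray.call(clone.querySelectorAll('*')));
  live.forEach(function (el, i) {
    // Computed styles are read from the live element, which is in the page.
    var computed = win.getComputedStyle(el);
    var css = '';
    for (var j = 0; j < computed.length; j++) {
      var name = computed[j];
      css += name + ':' + computed.getPropertyValue(name) + ';';
    }
    copies[i].setAttribute('style', css);
  });
  return clone.outerHTML;
}
```

In a browser this might be called as `serializeWithInlineStyles(document.body, window)`. Expect the output to be large, since every computed property is written out for every element.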

-
I think that question is a duplicate of this question. However, the answer to that question is good. – 把友情留在无盐 Mar 23 '15 at 13:13