How do I save an HTML page with all styles and images in C#? I need to make a programmatic implementation of a browser's 'Save' feature which doesn't rely on Internet Explorer (WebBrowser component).
Possible duplicate of http://stackoverflow.com/questions/729355/save-webpage-using-webbrowser-control – Nick Craver Apr 16 '10 at 13:53
@Nick Craver: C# is not the same as VB6. Also, this question wants to exclude the WebBrowser control. – Oskar Kjellin Apr 16 '10 at 13:55
@Oskar - Same answer though, just P/Invoke it from C#: http://www.pinvoke.net/default.aspx/urlmon.urldownloadtofile – Nick Craver Apr 16 '10 at 14:01
@Nick Craver: That still relies on IE. – Kristina Apr 16 '10 at 14:08
True. But is this really what he wants? The problem he is facing (as I understood it) is getting all the styles and pictures, not the downloading itself. – Oskar Kjellin Apr 16 '10 at 14:09
2 Answers
I do not think this is very easy.
Download the HTML for the page using WebClient and write the text to an HTML file. Then use an HTML parser to find all linked images and save them in a subdirectory. Do the same for the CSS.
If you do not want to save the images, you can instead turn every image link into an absolute URL by resolving it against the URL of the page. Note that some URLs are already absolute rather than relative, so you will have to handle both cases. And don't forget to scan the CSS file for linked images too.
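The link-finding and URL-resolving steps could be sketched roughly like this. It is only a minimal sketch: the class and method names (`PageResources`, `FindInHtml`, `FindInCss`) are made up for illustration, and it uses regular expressions where a real HTML parser would be more robust. It assumes you have already downloaded the HTML or CSS text (for example with WebClient) and know the URL it came from.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class PageResources
{
    // src/href values on <img>, <script> and <link> tags.
    // A regex is a simplification; it will miss unusual markup.
    private static readonly Regex HtmlAttr = new Regex(
        @"<(?:img|script|link)\b[^>]*?\b(?:src|href)\s*=\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase);

    // url(...) references in CSS, quoted or unquoted.
    private static readonly Regex CssUrl = new Regex(
        @"url\(\s*[""']?([^""')]+?)[""']?\s*\)",
        RegexOptions.IgnoreCase);

    public static List<Uri> FindInHtml(string html, Uri pageUri)
    {
        return Resolve(HtmlAttr, html, pageUri);
    }

    // Relative URLs inside a stylesheet are relative to the CSS file,
    // not to the page, so pass the stylesheet's own URL here.
    public static List<Uri> FindInCss(string css, Uri cssUri)
    {
        return Resolve(CssUrl, css, cssUri);
    }

    private static List<Uri> Resolve(Regex pattern, string text, Uri baseUri)
    {
        var result = new List<Uri>();
        foreach (Match m in pattern.Matches(text))
        {
            string raw = m.Groups[1].Value.Trim();

            // Inline data: URIs carry their own content; nothing to download.
            if (raw.StartsWith("data:", StringComparison.OrdinalIgnoreCase))
                continue;

            // Handles both relative and already-absolute references.
            Uri absolute;
            if (Uri.TryCreate(baseUri, raw, out absolute))
                result.Add(absolute);
        }
        return result;
    }
}
```

Each returned `Uri` is absolute, so it can be handed straight to whatever you use for downloading. Stylesheets found in the HTML would then be downloaded and passed through `FindInCss` with their own URL as the base.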

I had a similar problem to solve. The biggest problem for you will be the images that come from CSS; they are very difficult to parse out.
So I chose to use FiddlerCore for that. It might help you too.
The difficult part of your task is creating your own directory structure and changing the image paths to match.
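To give an idea of what that path rewriting might look like, here is a rough sketch. The layout (a `page_files/<host>/<path>` folder next to the saved page) and the names `LocalLayout`, `LocalPathFor` and `RewriteHtml` are just assumptions for illustration. It only touches `src`/`href` on `<img>`, `<script>` and `<link>` tags, and it ignores query strings, so two URLs that differ only by query would collide.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class LocalLayout
{
    // Group 1: everything up to the opening quote, group 2: the URL,
    // group 3: the closing quote. A regex is a simplification here.
    private static readonly Regex ResourceAttr = new Regex(
        @"(<(?:img|script|link)\b[^>]*?\b(?:src|href)\s*=\s*[""'])([^""']+)([""'])",
        RegexOptions.IgnoreCase);

    // Maps an absolute resource URL to a path relative to the saved page.
    public static string LocalPathFor(Uri resource, string folder = "page_files")
    {
        string path = resource.AbsolutePath.TrimStart('/');
        if (path.Length == 0 || path.EndsWith("/"))
            path += "index"; // directory-style URLs need some file name
        return folder + "/" + resource.Host + "/" + path;
    }

    // Rewrites resource references in the HTML to point at the local copies.
    public static string RewriteHtml(string html, Uri pageUri)
    {
        return ResourceAttr.Replace(html, m =>
        {
            Uri absolute;
            if (!Uri.TryCreate(pageUri, m.Groups[2].Value.Trim(), out absolute))
                return m.Value; // leave unparseable references untouched

            // Only http(s) resources are saved locally; skip data:, mailto:, etc.
            if (absolute.Scheme != Uri.UriSchemeHttp && absolute.Scheme != Uri.UriSchemeHttps)
                return m.Value;

            return m.Groups[1].Value + LocalPathFor(absolute) + m.Groups[3].Value;
        });
    }
}
```

The same `LocalPathFor` mapping would have to be used when writing the downloaded files to disk, so that the rewritten links and the saved files agree. `url(...)` references inside CSS need the same treatment, with paths computed relative to where the stylesheet itself is saved.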
