I would like to use C# to download an .html page that contains scanned text images, just as I can in a browser via:
right click -> Save Page As ...
I have tried 3 different methods:
1. and 2. from here:
How can I download HTML source in C#
3. from here:
Get HTML code from website in C#
I have tried saving the file as suggested here:
Creating a file (.htm) in C# or using
System.IO.File.WriteAllText(@"C:\xy.html", htmlSourceString);
My problem is that when I open the downloaded file, the text on the images has been automatically extracted into HTML paragraphs, and the images are lost.
How can I disable this transformation?
UPDATE
Thank you for your reply! Now I understand that I have to download the images individually.
But I'm still curious: Why is this transformation happening?
I have made a pic to demonstrate exactly what I'm talking about.
click for the pic
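
For reference, here is a minimal sketch of how I understand "downloading the images individually" could work. It is not a definitive implementation: the class name PageSaver, the file names page.html and imageN, and the regex used to find img tags are all my own assumptions, and it only works if the images appear as plain img src attributes in the HTML that the server returns.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Hypothetical helper, named for illustration only.
public static class PageSaver
{
    // Collects the absolute URLs of all <img ... src="..."> attributes.
    // A regex is a simplification; a real HTML parser would be more robust.
    public static List<Uri> FindImageUrls(string html, Uri pageUrl)
    {
        var urls = new List<Uri>();
        var pattern = @"<img[^>]*?\ssrc\s*=\s*[""']([^""']+)[""']";
        foreach (Match m in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
        {
            // Resolve relative paths against the address of the page.
            if (Uri.TryCreate(pageUrl, m.Groups[1].Value, out Uri absolute))
                urls.Add(absolute);
        }
        return urls;
    }

    // Saves the HTML source and then each referenced image into targetDir.
    public static async Task SaveAsync(Uri pageUrl, string targetDir)
    {
        Directory.CreateDirectory(targetDir);
        using (var client = new HttpClient())
        {
            string html = await client.GetStringAsync(pageUrl);
            File.WriteAllText(Path.Combine(targetDir, "page.html"), html);

            int index = 0;
            foreach (Uri imageUrl in FindImageUrls(html, pageUrl))
            {
                byte[] bytes = await client.GetByteArrayAsync(imageUrl);
                string ext = Path.GetExtension(imageUrl.AbsolutePath);
                File.WriteAllBytes(Path.Combine(targetDir, "image" + index++ + ext), bytes);
            }
        }
    }
}
```

Note that the saved page.html still points at the original image URLs; rewriting the src attributes to the local files would be a further step that this sketch leaves out.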