I have a single diagnostic web page with charts on a device. The page is XML, rendered with an XSL stylesheet and GIF images. Is there a way to use Python to download the entire page and save it as a single .mht file instead of separate files?

Matt

2 Answers

This is essentially a combination of two problems: saving a web page together with its dependencies, and packing the saved files into an MHT archive.

AFAIK, you could download the page with urllib, parse the HTML with Beautiful Soup, and find the images and other dependencies in the parsed HTML. Then download those, rewrite the image URLs in the parsed HTML to point to the local copies (Beautiful Soup can do this), save the modified HTML back to disk, and use MHTifier to generate the MHT.
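
As a rough illustration of the final packing step, here is a minimal sketch that uses only the standard library instead of MHTifier (a swapped-in approach, not what MHTifier does internally). It relies on the fact that an MHT file is a `multipart/related` MIME message in which each part carries a `Content-Location` header, so it keeps the original URLs rather than rewriting them. The names `fetch` and `build_mht` and the `device.example` URLs are made up for this example, and whether a given viewer applies an XSL stylesheet stored inside an MHT archive is untested here.

```python
import urllib.request
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def fetch(url):
    """Download one URL and return (mime_type, raw_bytes)."""
    with urllib.request.urlopen(url) as response:
        return response.headers.get_content_type(), response.read()


def build_mht(page_url, page_markup, resources, page_subtype="html"):
    """Pack a page and its already-downloaded dependencies into MHT bytes.

    resources maps an absolute URL to a (mime_type, raw_bytes) tuple.
    page_subtype is an assumption: "html" for HTML, "xml" for an XML page.
    """
    archive = MIMEMultipart("related", type="text/" + page_subtype)

    # The first part is the root document that the viewer opens.
    root = MIMEText(page_markup, page_subtype, "utf-8")
    root["Content-Location"] = page_url
    archive.attach(root)

    # Every dependency becomes a base64-encoded part keyed by its URL.
    for url, (mime_type, data) in resources.items():
        maintype, subtype = mime_type.split("/", 1)
        part = MIMEBase(maintype, subtype)
        part.set_payload(data)
        encoders.encode_base64(part)
        part["Content-Location"] = url
        archive.attach(part)

    return archive.as_bytes()
```

To use it, you would call `fetch` on the page URL and on each dependency URL you found while parsing (for example `http://device.example/chart.gif`), collect the results in the `resources` dictionary, and write the bytes returned by `build_mht` to a file ending in `.mht`.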

Perhaps Scrapy could help you, too.

Haroldo_OK

Hi, I was able to convert both a live web page and a local HTML file to .mht using win32com. Have a look at https://stackoverflow.com/a/59321911/5290876.

Could you share a sample XML file with its XSL and images for testing?

Chetan