Can you web scrape with css formatting?

Question

I want to make a Python bot that can interact with Symbolab. Here is an example. I have tried using the requests library and an example of the HCTI library to render the page as an image. Whenever I do this, the page loses its formatting. I am new to web scraping, but I presume this is because the CSS is not being rendered, as I was just grabbing the HTML. Is there a way to save an image file of a site like Symbolab that renders the page the way a web browser does (all of the equations are readable, etc.)?

score 0 · Answer 1 · answered Apr 28 '22 at 17:15

You are correct that the CSS is not rendered. When you use the requests library, you only get the raw HTML the server returns for that URL; nothing is rendered, and linked resources such as stylesheets are not fetched. If you look at Symbolab's page, their CSS is referenced in <link href="/public/auto/main.min.css?110025" rel="Stylesheet" type="text/css"> inside the head of the page's HTML.

If you want to use HCTI (which I assume is https://htmlcsstoimage.com/?), it looks like they accept an html parameter as well as a separate css parameter. So you could make another request to https://www.symbolab.com/public/auto/main.min.css?110025 to get the CSS and pass that to HCTI along with the HTML.
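A rough sketch of that idea follows. The endpoint https://hcti.io/v1/image, the basic-auth scheme (user ID plus API key) and the "url" field in the JSON response are my reading of HCTI's public docs, so treat them as assumptions and check them against your account. PAGE_URL is a placeholder, and build_payload and render_image are names made up for this example.

```python
HCTI_ENDPOINT = "https://hcti.io/v1/image"  # assumption: taken from HCTI's docs
PAGE_URL = "https://www.symbolab.com/"  # placeholder: use the page you want
CSS_URL = "https://www.symbolab.com/public/auto/main.min.css?110025"


def build_payload(html, css):
    """Form fields for HCTI: the markup and the stylesheet are sent separately."""
    return {"html": html, "css": css}


def render_image(user_id, api_key):
    """Fetch the page and its CSS, then ask HCTI to render them together."""
    # Third-party dependency; imported here so build_payload works without it.
    import requests

    html = requests.get(PAGE_URL, timeout=30).text
    css = requests.get(CSS_URL, timeout=30).text
    response = requests.post(
        HCTI_ENDPOINT,
        data=build_payload(html, css),
        auth=(user_id, api_key),  # assumption: HTTP basic auth
        timeout=60,
    )
    response.raise_for_status()
    # Assumption: the response JSON carries the image location under "url".
    return response.json()["url"]
```

Calling render_image("your-user-id", "your-api-key") should give back a link to the rendered image if those assumptions hold. Anything the page builds with JavaScript after loading will still be missing, since requests never runs scripts.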

But this only works if there are no other CSS references on their page and this URL doesn't become invalid. To get around that, you could scrape the HTML you received for CSS references, so you always fetch the most up-to-date links.
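Finding the CSS references could look something like this, using only the standard library. It is a minimal sketch: StylesheetCollector and find_stylesheets are names invented for this example, and it only picks up <link rel="stylesheet"> tags, not inline <style> blocks or @import rules.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class StylesheetCollector(HTMLParser):
    """Collects absolute URLs of every <link rel="stylesheet"> it sees."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.stylesheets = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel values are case-insensitive and may hold several tokens.
        rel = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "stylesheet" in rel and href:
            # Resolve relative paths against the page's own URL.
            self.stylesheets.append(urljoin(self.base_url, href))


def find_stylesheets(html, base_url):
    collector = StylesheetCollector(base_url)
    collector.feed(html)
    return collector.stylesheets


sample = (
    '<head><link href="/public/auto/main.min.css?110025" '
    'rel="Stylesheet" type="text/css"></head>'
)
print(find_stylesheets(sample, "https://www.symbolab.com/"))
```

You would pass in the HTML you got from requests along with the URL you requested, then download each returned link and concatenate the results before handing them to HCTI.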

An easier solution might be to use Selenium to programmatically control a browser, which does all the rendering just as a regular browser would. You can then take a screenshot of the page with Selenium, or even of a specific element. See this answer.
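Here is a minimal sketch of the Selenium route, assuming Selenium 4 and a local Chrome install. The function name screenshot_page, the window size and the timeout are choices made for this example, not anything Symbolab or Selenium prescribes.

```python
def screenshot_page(url, output_path, selector=None, wait_seconds=15):
    """Save a PNG of the whole window, or of one element if selector is given."""
    # Third-party dependency (pip install selenium), imported when called.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")  # run Chrome without a visible window
    options.add_argument("--window-size=1280,2000")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        if selector is None:
            return driver.save_screenshot(output_path)
        # Wait until the element is actually displayed before capturing it.
        element = WebDriverWait(driver, wait_seconds).until(
            EC.visibility_of_element_located((By.CSS_SELECTOR, selector))
        )
        return element.screenshot(output_path)
    finally:
        driver.quit()  # always close the browser, even on errors
```

For example, screenshot_page("https://www.symbolab.com/", "page.png") captures the visible window. To capture only the solution area, pass a CSS selector for it; you would need to find the right selector with your browser's developer tools.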

Hope this helps.
