I wrote the following quick example to learn some of the basics of requests and BeautifulSoup.

import requests
from bs4 import BeautifulSoup

url = 'http://www.tagesschau.de'
r = requests.get(url)
r_html = r.text
soup = BeautifulSoup(r_html, 'html.parser')
soup_prettified = soup.prettify()

with open('text_test_1.html','w') as open_file:
    open_file.write(soup_prettified.encode('ascii', 'replace'))

Everything works fine, but when I open the saved HTML file, it does not look like the original webpage at all. It looks more like a list of links. Why is that? How can I get something that really looks like a picture of the original webpage?

It is not the same question as the one marked as duplicate, since I don't want to save just the HTML.

  • could you show us a snippet of your output? and if you want a picture of the webpage you could always right click -> save as. or is that not what you are asking? – MattR Sep 22 '16 at 16:21
  • First off `.encode('ascii', 'replace')` is a terrible idea, almost certainly going to give you broken html. Second, the page has Javascript etc.. that is rendering content so you are not going to see the same content when you save the pure source – Padraic Cunningham Sep 22 '16 at 16:31
  • What are you trying to do? It looks like you are trying to download the webpage from a link, pretty print it to neatly format it and then save it to a file. Do you wish to 'screenshot' the webpage at that link? I'm not sure if that is possible with python. – scotty3785 Sep 22 '16 at 16:32
  • You can use [webkit2png](http://stackoverflow.com/questions/1197172/how-can-i-take-a-screenshot-image-of-a-website-using-python) if you want a screenshot but it isn't python code. – scotty3785 Sep 22 '16 at 16:34
  • My idea is simply to find a way to crawl any webpage and make it look on my local machine as if it were online (even though it actually isn't). Your comments are very helpful to me; I will try to work through them and find my error. Just taking a picture doesn't really solve it: suppose you have a 100-page tutorial you would like to save. Then you want to save it with one click, not take 100 pictures manually. – Peter Series Sep 22 '16 at 16:35
  • You can just save as a pdf, there are numerous plugins that will do it for you . It would be reasonably easy to integrate that into code so you could automate it – Padraic Cunningham Sep 22 '16 at 17:00
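
Following up on the comments above, here is a minimal sketch of one possible improvement, not a complete solution. It assumes Python 3 and that part of the "list of links" look comes from relative stylesheet and image URLs that no longer resolve once the file sits on disk. The helper names `add_base_tag` and `save_page` are made up for this illustration. The idea is to write the file as UTF-8 instead of ASCII, and to insert a `<base>` element so the browser resolves relative URLs against the original site.

```python
from bs4 import BeautifulSoup


def add_base_tag(html, base_url):
    """Return html with a <base href=base_url> as the first child of <head>.

    Hypothetical helper for illustration. If the document has no <head>,
    or already declares a <base>, the markup is left as it is.
    """
    soup = BeautifulSoup(html, 'html.parser')
    if soup.head is None:
        return html
    if soup.head.find('base') is None:
        base_tag = soup.new_tag('base', href=base_url)
        soup.head.insert(0, base_tag)
    return str(soup)


def save_page(url, path):
    """Download url and save it to path as UTF-8 (hypothetical helper)."""
    # Imported here so add_base_tag can be used without requests installed.
    import requests

    response = requests.get(url)
    response.raise_for_status()
    html = add_base_tag(response.text, url)
    # Writing text with an explicit UTF-8 encoding keeps umlauts intact;
    # .encode('ascii', 'replace') would turn each of them into '?'.
    with open(path, 'w', encoding='utf-8') as out_file:
        out_file.write(html)


if __name__ == '__main__':
    save_page('http://www.tagesschau.de', 'text_test_1.html')
```

Two caveats: the saved file still loads its stylesheets and images from the original server, so it is not a true offline copy, and anything the page builds with JavaScript after loading may still be missing, as noted in the comments.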
