url = "http://" + str(input)
t = urllib.request.urlopen(url)
How can I save the source code of any website in a .txt file? I use Python 3.
There are multiple ways you can get this done.
You can use any HTTP library you like; my personal favorite is requests. The code goes as follows:
import requests
headers = {'User-Agent': 'Mozilla/5.0'}  # the standard header name is 'User-Agent'
html_data = requests.get('Your url goes here', headers=headers)
This stores the response object in html_data. To get the page source as text, you can use
html_data = html_data.text
file = open('your file path goes here', 'ab')  # open the file at the given path; 'ab' appends bytes, use 'wb' to overwrite
file.write(html_data.encode('UTF-8'))  # html_data is now a str, so encode it to bytes for the binary-mode file
file.close()  # always close the file so the data is actually flushed to disk
This will save all the HTML code from the website to the text file you specified in the path.
If you meant saving only the visible text of the website, try a parser library; I recommend BeautifulSoup.
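To illustrate what "visible data" means here, below is a minimal sketch that uses only the standard library's html.parser as a stand-in for BeautifulSoup (it is not BeautifulSoup's API). The names VisibleTextParser and visible_text are made up for this example, and the approach is an assumption about what counts as visible: it simply skips the contents of script and style tags.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []      # non-empty pieces of visible text

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the text content of an HTML string, one chunk per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

You could pass the page source you downloaded to visible_text and write the result to the file instead of the raw HTML. A real parser library handles malformed markup and hidden elements far better than this sketch.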
Here are the links to the official documentation for the libraries used and recommended.
There are tons of videos and tutorials about this, but still:
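import urllib.request  # in Python 3, urlopen lives in urllib.request
# url is the address built in the question
t = urllib.request.urlopen(url).read()  # read() returns bytes
with open("c:\\source_code.txt", 'wb') as source_code:  # binary mode, because t is bytes
    source_code.write(t)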
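If you would rather store the page as readable text than as raw bytes, the same idea can be sketched as below. The function name save_page_source is hypothetical, and falling back to UTF-8 when the server declares no charset is an assumption, not something urllib does for you.

```python
import urllib.request


def save_page_source(url, path):
    """Download url and write its source to path as UTF-8 text.

    Returns the number of characters written.
    """
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # bytes as sent by the server
        # Use the charset from the Content-Type header if there is one;
        # falling back to UTF-8 is an assumption made for this sketch.
        charset = response.headers.get_content_charset() or "utf-8"
    text = raw.decode(charset, errors="replace")
    with open(path, "w", encoding="utf-8") as out:
        out.write(text)
    return len(text)
```

A call such as save_page_source("http://" + input(), "source_code.txt") would then mirror the snippet from the question.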
This is the quickest way:
import urllib.request
a = input()  # input() already returns a str in Python 3
url = "http://" + a
urllib.request.urlretrieve(url, 'page.txt')  # downloads the page straight into page.txt
Bear in mind that the site may not always be served over http://, and that input is a function, so it always has to be called with parentheses: input().
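One way to cope with addresses that may or may not already start with http:// or https:// is to add the prefix only when it is missing. This is a small sketch; the helper name with_scheme and the choice of http as the default are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def with_scheme(address, default_scheme="http"):
    """Return address unchanged if it already starts with http:// or
    https://, otherwise prepend default_scheme."""
    address = address.strip()
    if urlsplit(address).scheme in ("http", "https"):
        return address
    return f"{default_scheme}://{address}"
```

With this, url = with_scheme(input()) replaces the hard-coded "http://" + ... concatenation.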