(using Python) How to save the text on a webpage into an Excel file?

Question

Every day I need to open a webpage, copy the text on the page and paste it into an Excel file. Is there a way that I can automate this process using Python, without bothering to open a web browser?

Thanks to everyone who provided an answer. Would it be possible to show me an example?

Thanks.

If the question is can you do this then the answer is yes, but the point of SO isn't to get other people to do the work for you. — Noelkd, May 31 '13 at 11:06

score 1 · Answer 1 · answered May 31 '13 at 11:13

Sure, simply use urllib2 to open your webpage, then have a look at the content with BeautifulSoup and then just stick that data into the Excel file with xlwt. Easy!
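A minimal sketch of those three steps, assuming Python 3 (where urllib2 became urllib.request) and that the beautifulsoup4 and xlwt packages are installed. The function names, the sheet name, the URL and the output file name are placeholders chosen for illustration, not part of the answer above.

```python
from urllib.request import urlopen

import xlwt
from bs4 import BeautifulSoup


def fetch_html(url):
    """Download the page and return its HTML as text."""
    with urlopen(url, timeout=10) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def visible_lines(html):
    """Return the page's text as a list of non-empty, stripped lines."""
    soup = BeautifulSoup(html, "html.parser")
    # Script and style contents are not text a reader would copy.
    for tag in soup(["script", "style"]):
        tag.decompose()
    text = soup.get_text(separator="\n")
    return [line.strip() for line in text.splitlines() if line.strip()]


def save_lines_to_xls(lines, path):
    """Write one line of text per row into column A of an .xls file."""
    workbook = xlwt.Workbook()
    sheet = workbook.add_sheet("page text")  # hypothetical sheet name
    for row, line in enumerate(lines):
        sheet.write(row, 0, line)
    workbook.save(path)


# Example use (the URL and file name are placeholders):
# save_lines_to_xls(visible_lines(fetch_html("https://example.com/")), "page.xls")
```

Note that xlwt writes the older .xls format, so very long pages may run into that format's row limit.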

danodonovan

thanks for the reply and the links, which are useful for study. – Thank you help me learn May 31 '13 at 11:44
Instead of using urllib2, you could try the excellent "requests" library. It handles much of the heavy lifting for you. http://docs.python-requests.org/en/latest/ – twasbrillig Oct 05 '14 at 04:30

score 1 · Accepted Answer · edited May 23 '17 at 11:51

You could use a technique called web scraping. There is even an open source framework written in Python, called Scrapy, that is designed specifically for crawling and screen scraping.

Just do a Google search with a phrase such as "web scraping using python"; this should be enough to get you started.

There is some good information in the following post: Anyone know of a good Python based web crawler that I could use?

edited by Community
answered May 31 '13 at 11:14 by Nishan

this is direct, suitable for newbie like me :) – Thank you help me learn May 31 '13 at 11:42

score 1 · Answer 3 · answered May 31 '13 at 11:16

Yes, you can do this.

I would suggest:

1. Read up on urllib and urllib2 for fetching the page in Python.
2. Investigate lxml for parsing the content of your page.
3. Take a look at this page on Python Excel manipulation.
4. Attempt to write some code to do what you wish.
5. If you don't succeed immediately, ask for help and provide code examples.
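
As a starting point for the fetching and parsing steps, here is a rough sketch. It assumes Python 3 (urllib.request takes the place of urllib2), that the lxml package is installed, and that the text of interest sits in <p> elements, which will not be true of every page. The function names and the URL are hypothetical.

```python
from urllib.request import urlopen

import lxml.html


def fetch(url):
    """Return the raw bytes of the page at the given URL."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def paragraph_texts(html):
    """Return the text of every non-empty <p> element, in document order."""
    tree = lxml.html.fromstring(html)
    # text_content() flattens nested tags such as <b> or <a> into plain text.
    texts = (p.text_content().strip() for p in tree.xpath("//p"))
    return [text for text in texts if text]


# Example use (the URL is a placeholder):
# for text in paragraph_texts(fetch("https://example.com/")):
#     print(text)
```

The XPath expression is the part to adapt: inspect the page's HTML and select whichever elements actually hold the text you copy by hand. Writing the result to Excel then depends on the Excel library you settle on.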

Good luck

Chris Clarke

thanks for the details and links, and bullet points. professional! – Thank you help me learn May 31 '13 at 11:42

score 1 · Answer 4 · answered May 31 '13 at 11:53

You can do the same, on a small scale, in Excel itself by importing data from the web: from the Excel Ribbon select 'Data' > 'From Web'. If you are set on using Python, try https://datanitro.com/ . DataNitro is an excellent Python-Excel integration. Here is a demo: http://scriptogr.am/richie/post/python-for-excel-using-datanitro

richie

another point of view. thanks. – Thank you help me learn Jun 01 '13 at 00:58
Unfortunately DataNitro isn't free, unless you're a student. It costs $99 otherwise. – twasbrillig Oct 05 '14 at 04:36

score 0 · Answer 5 · answered May 31 '13 at 11:13

Yes, there is. Use urllib2 to pull the HTML from the web, parse the HTML for the values you need (with the BeautifulSoup module and regular expressions), and finally save the result as a CSV file, which can be opened in Excel.
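
A sketch of the CSV route that needs nothing outside the standard library, assuming Python 3. It swaps in html.parser for BeautifulSoup and regular expressions, so it is a stand-in for the approach described above and not the same toolchain. The class name, function names, URL and file name are made up for illustration.

```python
import csv
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text chunks, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_rows(html):
    """Turn an HTML string into one-column rows, one text chunk per row."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return [[chunk] for chunk in parser.chunks]


def write_csv(rows, path):
    """Write the rows to a CSV file that Excel can open."""
    # The utf-8-sig encoding adds a byte order mark, which helps Excel
    # recognise the file as UTF-8.
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        csv.writer(handle).writerows(rows)


# Example use (the URL and file name are placeholders):
# with urlopen("https://example.com/", timeout=10) as response:
#     page = response.read().decode("utf-8", errors="replace")
# write_csv(html_to_rows(page), "page.csv")
```

Each text chunk lands in its own row of column A. Splitting values into several columns would need page-specific parsing.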

Iliyan Bobev
