
There is a page I want to scrape: you can pass it variables in the URL and it generates specific content. All the content is in a giant HTML table.

I am looking for a way to write a script that can go through 180 of these different pages, extract specific information from certain columns in the table, do some math, and then write the results to a .csv file. That way I can do further analysis on the data myself.

What is the easiest way to scrape webpages, parse HTML and then store the data to a .csv file?

I have done similar things in Python and PHP, but the HTML parsing was neither easy nor clean. Are there other routes that are easier?

Reily Bourne
  • Web scraping is **not data mining**. It's at most "information extraction", or, well, web scraping. Please don't tag everything that doesn't involve databases and analysis as "data mining"... – Has QUIT--Anony-Mousse Mar 21 '12 at 20:56
  • This is a pretty idiosyncratic question, because your personal skill with different languages is going to make a big difference here: if you're a Python expert, then Python-based tools are going to be easier. You could make the question more useful to yourself and others by specifying the language you want to use. – nrabinowitz Mar 22 '12 at 17:03

1 Answer


If you have some experience with Python, I would recommend something like BeautifulSoup; in PHP you can use phpQuery.
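For example, here is a minimal BeautifulSoup sketch for pulling the cell text out of a page's table. The URL is a placeholder, and it assumes the `requests` and `beautifulsoup4` packages are installed:

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URL -- substitute the real page and its query-string variables.
html = requests.get("http://example.com/report?id=1").text
soup = BeautifulSoup(html, "html.parser")

table = soup.find("table")  # the page's one giant table
for row in table.find_all("tr"):
    # get_text(strip=True) drops surrounding whitespace from each cell
    cells = [td.get_text(strip=True) for td in row.find_all("td")]
    if cells:  # header rows contain only <th> cells, so this skips them
        print(cells)
```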

Once you know how to use the HTML parser, you can create a "pipes-and-filters" program to do the math and dump the results to a .csv file.
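A sketch of that pipeline, looping over all 180 pages and writing a CSV, might look like the following. The URL pattern, the column indices, and the ratio computation are all stand-ins for your actual data:

```python
import csv

import requests
from bs4 import BeautifulSoup

BASE_URL = "http://example.com/report?id={}"  # hypothetical URL pattern

def fetch_rows(page_id):
    """Filter 1: fetch one page, yield each table row as a list of cell strings."""
    html = requests.get(BASE_URL.format(page_id)).text
    soup = BeautifulSoup(html, "html.parser")
    for row in soup.find("table").find_all("tr"):
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if cells:
            yield cells

def transform(cells):
    """Filter 2: keep the columns you care about and do the math.
    Columns 1 and 3 and the ratio are placeholders for your real computation."""
    a, b = float(cells[1]), float(cells[3])
    return [cells[0], a, b, a / b]

with open("results.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["label", "a", "b", "ratio"])
    for page_id in range(1, 181):  # the 180 pages
        for cells in fetch_rows(page_id):
            writer.writerow(transform(cells))
```

Keeping the fetch/parse stage and the math stage as separate functions is what makes this "pipes-and-filters": each filter can be tested on its own, and swapping out the parser or the output format only touches one stage.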

Have a look at this question for more info on a Python solution.

ebaxt