Is there a good tutorial or sample for learning about web scraping over HTTP? How do I start developing a tool that can search certain web sites and download specific information, so that I can collect it automatically and then analyse it? Thanks!
1 Answer
A tool commonly recommended for this is the Html Agility Pack, a .NET library. It takes malformed HTML and massages it into XHTML and then a traversable DOM, so it is very useful for the markup you find in the wild, as opposed to approaches like regular expressions, which are destined to break.
There are some examples and the API documentation here:
http://html-agility-pack.net/api
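To illustrate why a tolerant parser copes better than a regex with real-world markup, here is a minimal sketch. Since the question does not name a language, it uses Python's standard-library `html.parser` rather than the Html Agility Pack itself, so none of this is the Html Agility Pack API. The names `LinkCollector`, `extract_links` and `SAMPLE` are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, text) pairs for every <a href=...> element seen."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # A new <a> implicitly ends any <a> that was never closed.
            self._flush()
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a":
            self._flush()

    def _flush(self):
        # Store the link being built, if there is one, and reset the state.
        if self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) pairs found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    parser._flush()  # keep a link that was still open at end of input
    return parser.links


# Hypothetical malformed input: unclosed <li>, <a> and <p>, unquoted attribute.
SAMPLE = '<ul><li><a href="/a">First<li><a href=/b>Second</a><p>unclosed'

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE):
        print(href, text)
```

In a real scraper the HTML string would come from an HTTP response (for example via `urllib.request.urlopen`) instead of a literal. The point of the sketch is only that the parser keeps going through unclosed tags and unquoted attributes, which is the kind of input that tends to defeat hand-written patterns.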
Some useful links:

wp78de

D'Arcy Rittich