I need a good web crawler written in Python that stores complete pages in a MySQL database. The small system I am experimenting with currently uses the PHP crawler Sphider to crawl pages and store them in the database. I need something that works almost exactly like Sphider, but written in Python: it only has to store pages in a database table, from which other scripts read the content and do the rest of the job. Sphider is slow, and I want to replace it.

I have looked at Scrapy and some other projects, but none of them fit my needs. This is my last try before I start coding it myself, so if someone knows something that can solve this problem, please tell me.

Kara
Sam
  • Is there a reason why you can't use Scrapy and then override the save functions to put the data into a MySQL database? You could even use an ORM like SQLAlchemy to make it easier to save and retrieve info. Perhaps if you told us why Scrapy is insufficient, we could be of more help. – JudoWill Oct 26 '10 at 14:50
  • http://scrapy.org/ should do what you are looking for – ScraperWiki Oct 26 '10 at 10:08

1 Answer


Beware!

This answer is tailored for beginners; it is NOT the optimal or the most clever solution.

That said, for you I highly recommend Scrapy. Try the tutorial. And remember to use Firefox with the Firebug extension to navigate the page and learn the paths, XPaths and HTML locations of your data, which you will need later when writing the parser.
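To give an idea of how an XPath read off in Firebug ends up in parser code, here is a minimal sketch using only the standard library. The sample markup, the `extract` function and the field names are made up for illustration. `xml.etree.ElementTree` understands only a subset of XPath and needs well-formed markup, so for real pages you would more likely rely on Scrapy's own selectors.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (real HTML is often not this tidy).
SAMPLE = """<html><body>
<div id="content">
  <h1>Title</h1>
  <p class="lead">First paragraph.</p>
  <a href="/next">next</a>
</div>
</body></html>"""


def extract(markup):
    """Pull a title and all link targets out of the markup via XPath-like paths."""
    root = ET.fromstring(markup)
    # Path as you might read it off in Firebug: the <h1> inside div#content.
    title = root.find(".//div[@id='content']/h1").text
    # Every <a> element that carries an href attribute.
    links = [a.get("href") for a in root.findall(".//a[@href]")]
    return {"title": title, "links": links}
```

Calling `extract(SAMPLE)` returns a dictionary with the heading text and the list of link targets found in the sample.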

Check the similar answers "Going from Ruby to Python crawlers" and "Python read my outlook email mailbox and parse messages".

Save yourself time and use Firefox with the Firebug extension (enable Inspect).
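As for the "store the complete page in a database" part of the question, a rough sketch of a Scrapy-style item pipeline could look like the one below. A Scrapy pipeline is a plain class with `open_spider`, `close_spider` and `process_item` methods. The class name, the `pages` table and the `url`/`html` item fields are assumptions made for this example. To keep it runnable without a server, `sqlite3` stands in for MySQL; with a MySQL driver the connect call, the `?` placeholders and the upsert syntax would differ.

```python
import sqlite3


class PageStorePipeline:
    """Hypothetical pipeline that stores each crawled page as one table row."""

    def __init__(self, db_path=":memory:"):
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # sqlite3 is a stand-in here; a MySQL driver would connect with
        # host/user/password arguments instead of a file path.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS pages ("
            "url TEXT PRIMARY KEY, html TEXT NOT NULL)"
        )

    def close_spider(self, spider):
        self.conn.commit()
        self.conn.close()

    def process_item(self, item, spider):
        # INSERT OR REPLACE is SQLite syntax: re-crawling a URL overwrites
        # the stored copy. MySQL offers REPLACE INTO for the same effect.
        self.conn.execute(
            "INSERT OR REPLACE INTO pages (url, html) VALUES (?, ?)",
            (item["url"], item["html"]),
        )
        self.conn.commit()
        return item
```

In a Scrapy project such a class is enabled through the `ITEM_PIPELINES` setting, and the spider only has to yield items carrying the page URL and its raw HTML. Your other scripts can then read the `pages` table just as they read Sphider's tables now.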

Carlos Henrique Cano