2

I want to use Scrapy to crawl the links listed in a sitemap. I don't know much about this application, so I would appreciate any links, info, or documentation you could provide.

Thanks

JBlake

2 Answers

12

A new generic spider, SitemapSpider, has just been added to Scrapy trunk for this purpose. It will be available in the next release (Scrapy 0.14).

Bill the Lizard
Pablo Hoffman
0

All of the documentation is at http://doc.scrapy.org/. The tutorials can also be found at scrapy.org.

As for your question, see this SO question: "How to parse a sitemap.xml file using Scrapy's XmlFeedSpider?"
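
To get a feel for what parsing a sitemap.xml involves, independent of Scrapy, here is a small standard-library sketch that pulls the `<loc>` URLs out of a sitemap. The sample document and the function name `extract_sitemap_urls` are made up for the example. The main thing to notice is the XML namespace declared by the sitemap protocol, which has to be supplied when querying for elements.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol; "sm" is an arbitrary prefix.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sample sitemap, standing in for a downloaded sitemap.xml.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/about</loc></url>
</urlset>
"""


def extract_sitemap_urls(xml_text):
    """Return the page URLs listed in a sitemap document."""
    # Parse from bytes so the encoding declaration in the XML is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    # Every page entry is a <url> element containing a <loc> element.
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


urls = extract_sitemap_urls(SAMPLE_SITEMAP)
```

Each URL in the resulting list could then be handed to whatever crawler is in use. A sitemap index file (a `<sitemapindex>` root listing other sitemaps) is not handled by this sketch.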

Community
marr75