I'm using CrawlSpider with the rule defined below, but after the start URL the spider goes to the last page instead of the second one. Why does this happen, and how do I write the rule so that pages are followed in the correct order (2, 3, 4, etc.)?
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor


class MySpider(CrawlSpider):
    name = "spidername"
    allowed_domains = ["example.com"]
    start_urls = [
        "http://www.example.com/some-start-url.html",
    ]

    rules = (
        # Extract pagination links from the page and follow them
        Rule(SgmlLinkExtractor(allow=(r'/Page-\d+\.html',)),
             callback='parse_links', follow=True),
    )
The target site has slightly unusual pagination, but the rule defined above does find all of the existing pages.
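To illustrate the traversal order I'm after, here is a minimal sketch (not my actual code) of a plain spider that follows a single "next" link per page, so page N+1 is always requested after page N. The rel="next" XPath is only a placeholder for the site's real pagination markup, and it uses the newer response.xpath / urljoin API:

import scrapy


class SequentialSpider(scrapy.Spider):
    # Sketch only: visits page 1, then 2, then 3, ... by following
    # a single "next" link instead of extracting every page link at once.
    name = "sequential"
    allowed_domains = ["example.com"]
    start_urls = ["http://www.example.com/some-start-url.html"]

    def parse(self, response):
        # ... scrape the current page here ...

        # Placeholder selector: adjust to the site's actual pagination.
        next_page = response.xpath('//a[@rel="next"]/@href').extract_first()
        if next_page:
            # At most one request is queued at a time, so the crawl
            # order is preserved regardless of scheduler behaviour.
            yield scrapy.Request(response.urljoin(next_page), callback=self.parse)

I'd prefer to keep the CrawlSpider rule if there is a way to make it behave like this.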