Simply I need three conditions.
1) Log-in
2) Multiple request
3) Synchronous request ( sequential like 'C' )
I realized 'yield' should be used for multiple request.
But I think 'yield' works differently with 'C' and not sequential.
So I want to use request without 'yield' like below.
But crawl method wasn`t called ordinarily.
How can I call crawl method sequentially like C ?
class HotdaySpider(scrapy.Spider):
name = "hotday"
allowed_domains = ["test.com"]
login_page = "http://www.test.com"
start_urls = ["http://www.test.com"]
maxnum = 27982
runcnt = 10
def parse(self, response):
return [FormRequest.from_response(response,formname='login_form',formdata={'id': 'id', 'password': 'password'}, callback=self.after_login)]
def after_login(self, response):
global maxnum
global runcnt
i = 0
while i < runcnt :
**Request(url="http://www.test.com/view.php?idx=" + str(maxnum) + "/",callback=self.crawl)**
i = i + 1
def crawl(self, response):
global maxnum
filename = 'hotday.html'
with open(filename, 'wb') as f:
f.write(unicode(response.body.decode(response.encoding)).encode('utf-8'))
maxnum = maxnum + 1