I am new to Python and Scrapy, and I have not used callback functions before. The code below is my first encounter with them. As I understand it, the first request is executed and its response is passed to the callback function given as the second argument:

    def parse_page1(self, response):
        item = MyItem()
        item['main_url'] = response.url
        request = Request("http://www.example.com/some_page.html",
                          callback=self.parse_page2)
        request.meta['item'] = item
        return request

    def parse_page2(self, response):
        item = response.meta['item']
        item['other_url'] = response.url
        return item

I am unable to understand the following:
- How is the `item` populated?
- Does the `request.meta` line execute before the `response.meta` line in `parse_page2`?
- Where does the `item` returned from `parse_page2` go?
- Why is the `return request` statement needed in `parse_page1`? I thought the extracted items had to be returned from here.
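
The flow being asked about can be mimicked without Scrapy. The following is only a minimal sketch in plain Python: the `Request` and `Response` classes and the `run` loop are toy stand-ins invented for illustration, not Scrapy's real classes or engine, and the start URL is an assumption. It relies on one fact about Scrapy, namely that `response.meta` is a shortcut to the `meta` dict of the request that produced the response.

```python
# Toy stand-ins for illustration only; these are NOT Scrapy's real classes.
class Request:
    def __init__(self, url, callback, meta=None):
        self.url = url
        self.callback = callback
        self.meta = meta if meta is not None else {}


class Response:
    def __init__(self, url, request):
        self.url = url
        self.request = request

    @property
    def meta(self):
        # The response exposes the meta dict of the request that produced it.
        return self.request.meta


def parse_page1(response):
    item = {"main_url": response.url}  # a plain dict stands in for MyItem
    request = Request("http://www.example.com/some_page.html",
                      callback=parse_page2)
    request.meta["item"] = item  # attach the half-filled item to the request
    return request               # hand the request back to the engine


def parse_page2(response):
    item = response.meta["item"]  # same dict object that parse_page1 attached
    item["other_url"] = response.url
    return item                   # hand the finished item back to the engine


def run(start_url, callback):
    """Hypothetical engine loop: follow returned requests, collect returned items."""
    collected = []
    pending = [Request(start_url, callback)]
    while pending:
        request = pending.pop(0)
        response = Response(request.url, request)  # pretend the page was downloaded
        result = request.callback(response)
        if isinstance(result, Request):
            pending.append(result)    # a returned request gets scheduled
        elif result is not None:
            collected.append(result)  # a returned item gets collected
    return collected


items = run("http://www.example.com/", parse_page1)
print(items)
```

In this sketch the `request.meta` assignment necessarily runs first, because `parse_page2` is only called once the request returned by `parse_page1` has been processed. In real Scrapy, the items a callback returns are passed on to the item pipelines and feed exports rather than to a local list.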