I have a working downloader middleware that passes every request through Selenium and returns an HtmlResponse. I also want to attach some metadata to the request so that I can read it in the spider's parse method, but for some reason it is not available there. Could you help me, please?

middleware.py

def process_request(self, request, spider):
    request = request.replace(meta={'test': 'test'})
    self.driver.get(request.url)
    body = self.driver.page_source
    return HtmlResponse(self.driver.current_url, body=body, encoding='utf-8', request=request) 

spider.py

def parse(self, response):
    yield {'meta': response.meta}

1 Answer

When the first function yields a Request for a URL, the Request accepts another argument called meta, a dictionary through which we can pass details from the first function to the second one.

For example, in the first function, we will have:

yield Request(url, callback=self.parse_function, meta={'Date Added':date, 'Category':category}) 

In the second function, what we yield is this same meta dictionary, which is available as response.meta:

yield response.meta 
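Keep in mind that response.meta is not always limited to the keys we passed in: Scrapy may add its own entries (for example depth or download_latency), and yielding the dictionary wholesale would carry those into the item. A minimal sketch of keeping only our own keys follows; the helper name pick_fields and the sample values are assumptions for illustration, and the plain dict stands in for response.meta.

```python
def pick_fields(meta, wanted):
    """Return a new dict holding only the wanted keys that exist in meta."""
    return {key: meta[key] for key in wanted if key in meta}


# Stand-in for response.meta inside the second function. The last two keys
# mimic entries that Scrapy itself may add during a crawl.
meta = {
    'Date Added': '2020-01-01',
    'Category': 'phones',
    'depth': 1,
    'download_latency': 0.2,
}

item = pick_fields(meta, ['Date Added', 'Category'])
print(item)
```

Inside a real callback this would be used as yield pick_fields(response.meta, [...]) instead of yield response.meta.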

What about the details we extract in the second function? We should add them to the dictionary before the yield, just as we would add to any other dictionary, like this:

response.meta['Brand'] = brand
response.meta['Model'] = model
response.meta['Price'] = price
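As for the middleware in the question, one likely cause is that request.replace() returns a new Request and leaves the original untouched. Depending on the Scrapy version, the engine may tie the returned response back to the request it originally scheduled, in which case response.meta never sees the new dictionary. Setting the key on request.meta in place avoids this. The sketch below uses simplified, hypothetical stand-in classes (FakeRequest, FakeResponse, deliver), not Scrapy's real ones, purely to show the difference under that assumption.

```python
class FakeRequest:
    """Simplified stand-in for a Scrapy Request (hypothetical)."""

    def __init__(self, url, meta=None):
        self.url = url
        self.meta = dict(meta) if meta else {}

    def replace(self, **kwargs):
        # Like Request.replace(): build a NEW object, leave self unchanged.
        kwargs.setdefault('url', self.url)
        kwargs.setdefault('meta', self.meta)
        return FakeRequest(**kwargs)


class FakeResponse:
    """Simplified stand-in for a response that reads meta from its request."""

    def __init__(self, url, request=None):
        self.url = url
        self.request = request

    @property
    def meta(self):
        return self.request.meta


def deliver(original_request, response):
    # Assumption: the framework re-attaches the originally scheduled request.
    response.request = original_request
    return response


def process_request_replace(request):
    # Pattern from the question: meta lands on a copy of the request.
    request = request.replace(meta={'test': 'test'})
    return FakeResponse(request.url, request=request)


def process_request_in_place(request):
    # Alternative: mutate the meta dict of the request we were given.
    request.meta['test'] = 'test'
    return FakeResponse(request.url, request=request)


first = FakeRequest('https://example.com')
lost = deliver(first, process_request_replace(first)).meta

second = FakeRequest('https://example.com')
kept = deliver(second, process_request_in_place(second)).meta

print(lost)
print(kept)
```

In the real middleware, this would mean swapping the request.replace(...) line for request.meta['test'] = 'test' and leaving the rest of process_request as it is.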
V-cash