
I am confused about the logic of requesting a single page twice when Scrapy is used with the Selenium webdriver. In most cases I see the following code in the parse function, e.g.

def parse(self, response):
    self.selenium.get(response.url)
    # do some other stuff
    self.selenium.close()

So my question is: when the parse function gets the response, Scrapy has already made an HTTP request to the page? And in the function body we are making the same request again using the Selenium driver? If my assumption is true, how can we avoid this? If not, what is in the response argument passed?

Some examples 1, 2
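One common way to avoid the double fetch is to move Selenium into a custom downloader middleware, so that flagged requests are fetched by the webdriver instead of Scrapy's downloader, and `parse()` receives the Selenium-rendered page directly. The sketch below is illustrative, not an official API: the class name `SeleniumMiddleware` and the `meta['selenium']` flag are my own choices (projects such as scrapy-selenium follow the same idea), and the third-party imports are deferred into the methods so the class itself stays importable.

```python
class SeleniumMiddleware:
    """Hypothetical downloader middleware that short-circuits Scrapy's
    downloader: when a request carries meta['selenium'] = True, the page
    is fetched by the webdriver and returned as an HtmlResponse, so
    Scrapy never performs its own HTTP request for that URL."""

    def __init__(self):
        self.driver = None  # created lazily on the first Selenium request

    def _get_driver(self):
        if self.driver is None:
            from selenium import webdriver  # third-party, assumed installed
            self.driver = webdriver.Firefox()
        return self.driver

    def process_request(self, request, spider):
        # Requests without the flag fall through to the normal downloader.
        if not request.meta.get('selenium'):
            return None
        from scrapy.http import HtmlResponse  # third-party, assumed installed
        driver = self._get_driver()
        driver.get(request.url)  # the one real HTTP fetch
        body = driver.page_source.encode('utf-8')
        # Returning a Response from process_request tells Scrapy to skip
        # its downloader entirely for this request.
        return HtmlResponse(driver.current_url, body=body,
                            encoding='utf-8', request=request)

    def close_driver(self):
        # In a real project you would hook this to the spider_closed signal.
        if self.driver is not None:
            self.driver.quit()
```

With this enabled in `DOWNLOADER_MIDDLEWARES`, the spider yields `scrapy.Request(url, meta={'selenium': True})` and its `parse()` method works on the rendered body without calling `self.selenium.get(response.url)` again.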

  • you can use `Selenium` without `scrapy` to get data from a page, so `scrapy` seems useless in this example. But `scrapy` gives other useful elements like Pipelines to clean data, saving to csv, json, xml, filtering duplicated urls, downloading files/images, etc. – furas Dec 29 '17 at 12:56
  • BTW: based on the [image in the documentation](https://doc.scrapy.org/en/latest/topics/architecture.html), you would have to replace the "downloader" to use only Selenium - and this can be a problem. – furas Dec 30 '17 at 03:23
  • yes, I see it's trying to place multiple requests asynchronously – sakhunzai Jan 01 '18 at 07:07
