1

When I use Selenium I can see the browser GUI. Is it somehow possible to do the same with Scrapy, or is Scrapy strictly command-line based?

Ivan Bilan
[Scrapy](http://scrapy.org/) is a scraping and web crawling framework, while [Selenium](http://www.seleniumhq.org/) is for web browser automation. They are not the same, and one can't replace the other. – eLRuLL Nov 03 '15 at 23:42

3 Answers

3

No, Scrapy doesn't support that.

Scrapy is designed for web crawling, while Selenium is used for browser automation and testing. Opening a browser for every request a crawler makes would consume a lot of resources.

If you plan to crawl dynamic content, see: Can scrapy be used to scrape dynamic content from websites that are using AJAX?

Community
tixie
1

Scrapy by itself does not control browsers.

However, you could start a Selenium instance from a Scrapy crawler. Some people design their Scrapy crawlers this way: they process most pages using Scrapy alone, but fire up Selenium to handle the pages that need it.
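
As a rough illustration of that design, here is a minimal sketch of a Scrapy downloader middleware that hands only selected requests to Selenium and lets Scrapy download everything else. The names `BROWSER_PATH_PREFIXES`, `needs_browser` and `SeleniumFallbackMiddleware` are made up for this example, the path rule is an arbitrary assumption, and Firefox is just one possible driver. Note that the Selenium call blocks, so pages routed through it are fetched one at a time.

```python
from urllib.parse import urlparse

# Hypothetical rule: only URLs whose path starts with one of these
# prefixes are assumed to need a real browser.
BROWSER_PATH_PREFIXES = ("/app/", "/search")


def needs_browser(url):
    """Return True if the URL should be rendered with Selenium."""
    return urlparse(url).path.startswith(BROWSER_PATH_PREFIXES)


class SeleniumFallbackMiddleware:
    """Downloader middleware: render selected pages in a browser.

    Scrapy and Selenium are imported inside the methods so that this
    module can be imported (for example to test needs_browser) even
    when neither package is installed.
    """

    def __init__(self):
        self.driver = None  # started lazily, on the first page that needs it

    @classmethod
    def from_crawler(cls, crawler):
        from scrapy import signals

        middleware = cls()
        # Close the browser when the spider finishes.
        crawler.signals.connect(
            middleware.spider_closed, signal=signals.spider_closed
        )
        return middleware

    def process_request(self, request, spider):
        if not needs_browser(request.url):
            return None  # fall through to Scrapy's normal downloader

        from scrapy.http import HtmlResponse
        from selenium import webdriver

        if self.driver is None:
            self.driver = webdriver.Firefox()  # assumption: Firefox is available

        self.driver.get(request.url)
        # Returning a response here makes Scrapy skip its own download.
        return HtmlResponse(
            url=self.driver.current_url,
            body=self.driver.page_source,
            encoding="utf-8",
            request=request,
        )

    def spider_closed(self, spider):
        if self.driver is not None:
            self.driver.quit()
```

To try it, the class would be registered in the project's `DOWNLOADER_MIDDLEWARES` setting under whatever module path it lives in. Spider callbacks then receive an ordinary response either way, so the parsing code does not need to know which pages went through the browser.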

Louis
1

Building a crawler system for dynamic websites is not an easy task. You can use a web browser automation tool (like Selenium), or even integrate Selenium with Nutch (using nutch-selenium), but these solutions are still hard to develop, hard to test, and make sessions hard to manage, because we still have to "translate" our process into another language (such as Java or Python).

I propose a different approach to this problem. Instead of using a web browser automation tool, we can inject native JavaScript code into the browser (via an extension or add-on). The advantage of this approach is that we can easily inject third-party libraries (like jQuery for DOM selection, or Run.js for complicated processes) and use the APIs that browsers support. We can also take advantage of the debugging tools and testing frameworks of the JavaScript world.

I just built a system for crawling dynamic websites this way, and it worked very well (compared with nutch-selenium).

Vu Anh