
I am working on a script where I need to crawl websites, restricted to the base_url site only. Does anyone have a good idea how I can launch Scrapy from a custom Python script and get the crawled URLs back as a list?

acekapila
  • FYI, here is [a detailed answer](http://stackoverflow.com/questions/18838494/scrapy-very-basic-example/27744766#27744766) about running Scrapy from script. – alecxe Mar 17 '15 at 15:16
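The "only base_url site" requirement can be handled independently of how Scrapy is launched. A minimal sketch using only the standard library, assuming the links have already been extracted as `href` strings (the function name `same_site_links` is illustrative, not from Scrapy):

```python
from urllib.parse import urljoin, urlparse

def same_site_links(base_url, hrefs):
    """Resolve hrefs against base_url and keep only same-domain links."""
    base_netloc = urlparse(base_url).netloc
    links = []
    for href in hrefs:
        absolute = urljoin(base_url, href)  # make relative links absolute
        if urlparse(absolute).netloc == base_netloc:
            links.append(absolute)
    return links
```

Inside a Scrapy spider the same effect is usually achieved by setting `allowed_domains`, which filters offsite requests automatically.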

2 Answers


You can use a file to pass the URLs from Scrapy to your Python script.

Or you can print the URLs with a marker in your Scrapy spider, then have your Python script capture Scrapy's stdout and parse it into a list.
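The stdout approach above can be sketched as follows; the marker string and the `crawl_and_collect` helper are hypothetical, and the sketch assumes the spider prints one marked line per URL:

```python
import subprocess

MARK = "URL:: "  # hypothetical marker the spider prints before each link

def parse_marked_urls(output):
    """Pull the marked lines out of captured stdout into a list of URLs."""
    urls = []
    for line in output.splitlines():
        if line.startswith(MARK):
            urls.append(line[len(MARK):].strip())
    return urls

def crawl_and_collect(spider_name):
    """Run the spider as a subprocess and parse its captured stdout."""
    result = subprocess.run(
        ["scrapy", "crawl", spider_name],
        capture_output=True, text=True,
    )
    return parse_marked_urls(result.stdout)
```

Parsing marked lines rather than Scrapy's full log keeps the script robust against unrelated log output.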

amow

You can add Scrapy commands from an external library by adding a scrapy.commands section to entry_points in your setup.py.

from setuptools import setup, find_packages

setup(
    name='scrapy-mymodule',
    entry_points={
        'scrapy.commands': [
            'my_command=my_scrapy_module.commands:MyCommand',
        ],
    },
)

http://doc.scrapy.org/en/latest/experimental/index.html?highlight=library#add-commands-using-external-libraries

Also see Scrapy Very Basic Example.

Steven Almeroth