I am trying to create an Iron.io worker that uses Scrapy.

According to Iron.io, all of the code's dependencies must be packaged in the worker itself.

I created a folder called module to hold all the third-party modules and installed Scrapy into it with pip:

pip install scrapy -t module/

When I try to run Scrapy via python module/scrapy/__init__.py, I get this error:

Traceback (most recent call last):
  File "module/scrapy/__init__.py", line 10, in <module>
    __version__ = pkgutil.get_data(__package__, 'VERSION').decode('ascii').strip()
  File "/usr/lib/python2.7/pkgutil.py", line 578, in get_data
    loader = get_loader(package)
  File "/usr/lib/python2.7/pkgutil.py", line 464, in get_loader
    return find_loader(fullname)
  File "/usr/lib/python2.7/pkgutil.py", line 474, in find_loader
    for importer in iter_importers(fullname):
  File "/usr/lib/python2.7/pkgutil.py", line 424, in iter_importers
    if fullname.startswith('.'):
AttributeError: 'NoneType' object has no attribute 'startswith'
Hari K T

2 Answers

If the scrapy executable is not available, you can run Scrapy through its cmdline module:

python module/scrapy/cmdline.py

You can also run Scrapy from a script. Here is a very detailed answer.
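As a rough illustration of the script route: running a package's __init__.py directly leaves __package__ set to None, which is what pkgutil.get_data trips over in the traceback. A minimal sketch that avoids this by putting the vendored folder on sys.path and calling Scrapy's command-line entry point could look like the following. The helper names add_vendor_dir and run_scrapy are made up for this example, and the folder name module is taken from the question.

```python
import os
import sys


def add_vendor_dir(vendor_dir="module"):
    """Put the vendored packages directory at the front of sys.path.

    Hypothetical helper: "module" is the folder created with
    `pip install scrapy -t module/` in the question.
    """
    path = os.path.abspath(vendor_dir)
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


def run_scrapy(args):
    """Call Scrapy's command-line entry point with the given arguments."""
    # Imported lazily so that add_vendor_dir() has already adjusted sys.path.
    from scrapy.cmdline import execute

    # execute() takes an argv-style list, so the program name comes first.
    execute(["scrapy"] + list(args))
```

In a worker script you would call add_vendor_dir("module") first and then something like run_scrapy(["version"]). Note that execute() exits the interpreter when the command finishes, so it should be the last thing the script does.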

alecxe

You'd probably be better off using Scrapy directly from your IronWorker code rather than calling it from the command line, as shown on the front page of http://scrapy.org/ or in the tutorial: http://doc.scrapy.org/en/0.24/intro/tutorial.html

To use this in IronWorker, after you've done the pip install, be sure to add:

pip 'scrapy' 

to your .worker file. Then in your worker script, you'd import it:

import scrapy

Then use it as described in the tutorial linked above.

Travis Reeder
  • I like @alecxe's answer on how to run Scrapy from a script. Your answer is correct regarding how to install Scrapy on IronWorker. – Hari K T Mar 18 '15 at 02:15