
I'm using Python 2.7 with Eclipse. I'm doing a tutorial that builds a basic web scraper with Scrapy. Here is the link.

http://www.youtube.com/watch?v=4fbvkMhvsWY

Before launching the scraper from the command prompt, I get "unresolved import" errors on the following lines of code:

from scrapy.spider import BaseSpider

from scrapy.selector import HtmlXPathSelector

When I attempt to crawl in command prompt with the following command:

scrapy crawl myfile

I get the error, "Spider not found: myfile".

I also get another unresolved import error in my items.py file. "Field" not only gets the "unresolved import" error, but it also gets the "unused import" error.

Code from the items.py file:

from scrapy.item import Item, Field

Here is the code from the spider file (named Tutorial1.py):

from scrapy.spider import BaseSpider

from scrapy.selector import HtmlXPathSelector

class Tutorial1(BaseSpider):
    name = "Tutorial1"

    # allowed_domains takes bare domain names, not URLs with a scheme
    allowed_domains = ['wikipedia.org']
    start_urls = ["http://en.wikipedia.org/wiki/Home_page"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        print hxs.select('//div/a').extract()

I run into the same issues when attempting other tutorials, which leads me to believe that this has something to do with my directory setup. I'm not sure, though.

I've found other individuals are having similar problems.

Scrapy: ImportError: No module named items

Scrapy spider is not working

My system path looks like this:

C:\Python27;C:\Python27\Scripts

I do not get errors when importing the following:

import zope.interface

import twisted

import lxml

import OpenSSL

import scrapy

Please help me figure this out. Thanks in advance.


1 Answer

The argument to the scrapy crawl command must be the name of your spider. That name is set by the name attribute in your spider code (name = "Tutorial1"), so running scrapy crawl Tutorial1 should fix the command-line problem.
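To make the point about names concrete, here is a minimal plain-Python sketch of a lookup keyed on the name attribute. It is only an illustration and not Scrapy's actual implementation; FakeSpider, build_registry and find_spider are hypothetical names invented for this example.

```python
# Illustrative only: a plain-Python stand-in for looking spiders up by name.
# This is NOT how Scrapy is implemented; every name below is hypothetical.


class FakeSpider(object):
    """Stand-in base class; real spiders would subclass Scrapy's spider class."""
    name = None


class Tutorial1(FakeSpider):
    # The lookup uses this attribute, not the name of the file or the class.
    name = "Tutorial1"


def build_registry(spider_classes):
    """Map each spider's name attribute to its class."""
    return dict((cls.name, cls) for cls in spider_classes)


def find_spider(registry, requested_name):
    """Return the class registered under requested_name, or raise KeyError."""
    if requested_name not in registry:
        raise KeyError("Spider not found: %s" % requested_name)
    return registry[requested_name]


registry = build_registry([Tutorial1])
```

With this registry, find_spider(registry, "Tutorial1") returns the class, while find_spider(registry, "myfile") raises an error, which mirrors the "Spider not found: myfile" message from the question.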

As for the import errors, I've noticed that you're on Windows. Installing Scrapy on Windows (7) can be more involved than on other operating systems. This article recommends additionally installing pyopenssl, w3lib and pywin32.
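To narrow down the import errors, a small check like the one below might help. It is a sketch rather than a definitive diagnostic: the module list is copied from the question, and check_imports is a hypothetical helper name. Running it once from the command prompt and once from inside Eclipse would show whether both use the same interpreter and see the same packages (that they might differ is only an assumption).

```python
import importlib
import sys

# Module names taken from the question; adjust the list as needed.
MODULES = ["zope.interface", "twisted", "lxml", "OpenSSL", "scrapy"]


def check_imports(module_names):
    """Return a dict mapping each module name to None (importable) or an error string."""
    results = {}
    for module_name in module_names:
        try:
            importlib.import_module(module_name)
            results[module_name] = None
        except Exception as exc:
            # A broken install can fail with errors other than ImportError,
            # so any exception raised at import time is recorded.
            results[module_name] = "%s: %s" % (type(exc).__name__, exc)
    return results


if __name__ == "__main__":
    # Shows which interpreter is actually running this script.
    print("Interpreter: %s" % sys.executable)
    for name, error in sorted(check_imports(MODULES).items()):
        print("%-16s %s" % (name, "OK" if error is None else error))
```

If the interpreter path or the results differ between the two environments, the Eclipse interpreter configuration is a likely place to look.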

What version of scrapy are you using?

Talvalin
  • Hey Talvalin. Thanks for the feedback. Installation on Windows is indeed a process; it took me several hours. I used easy_install and installed version 0.16 of Scrapy. I'll try installing pyopenssl, w3lib, and pywin32. I appreciate the feedback. This has really been a headache. – Young Grasshopper Jan 15 '13 at 01:25
  • Apparently the 64 bit version of Python is very difficult to work with when trying to use Scrapy. I'm using the 64 bit version of Python. I'm going to uninstall it and re-install the 32 bit version of Python. I'll let you know how it goes. Thanks for your help so far. – Young Grasshopper Jan 22 '13 at 23:44
  • I believe I've fixed it. The 32 bit version is the way to go if you want to use Scrapy on Windows. – Young Grasshopper Jan 23 '13 at 01:54
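For anyone following the comments above, here is a short sketch for checking whether the installed Python is a 32-bit or 64-bit build. It uses only the standard library; python_bitness is a hypothetical helper name.

```python
import platform
import struct
import sys


def python_bitness():
    """Return 32 or 64 based on the pointer size of the running interpreter."""
    # struct.calcsize("P") is the size in bytes of a C pointer in this build.
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Python %s, %d-bit" % (sys.version.split()[0], python_bitness()))
    # platform.architecture() reports the same information as a tuple.
    print("platform.architecture(): %s" % (platform.architecture(),))
```

This reports the bitness of the interpreter itself, which can be 32-bit even on a 64-bit version of Windows.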