
At design time and using autocomplete, I can add the line from myitems import CarItem.

However, when I run my spider with scrapy crawl keizer -o allobjects.json, I get this error:

ModuleNotFoundError: No module named 'myitems'

Output:

  File "C:\scrapy\hw_spiders\spiders\keizer.py", line 11, in <module>
    from myitems import CarItem
ModuleNotFoundError: No module named 'myitems'

My folder structure:

[Image: screenshot of the project folder structure, not available]

My files:

keizer.py

import json
import re
import os

import scrapy
import time
from scrapy_splash import SplashRequest
from scrapy.selector import Selector
from scrapy.http import HtmlResponse

from myitems import CarItem

Not sure if it's relevant, but I also added this to my ".vscode\settings.json" file:

{
    "python.analysis.extraPaths": [
        "./hw_spiders"
    ]
}
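
As far as I know, python.analysis.extraPaths only tells the Pylance language server where to look for modules; it does not change sys.path when the spider actually runs. Here is a minimal sketch of that difference. It assumes (the question does not show this) that myitems.py sits directly inside the hw_spiders folder, and it rebuilds that layout in a throwaway temp directory. The helper name resolvable is made up for illustration.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path

# Assumed layout (hypothetical, rebuilt in a temp dir):
#   <root>/hw_spiders/myitems.py
root = Path(tempfile.mkdtemp())
package_dir = root / "hw_spiders"
package_dir.mkdir()
(package_dir / "myitems.py").write_text("class CarItem:\n    pass\n")


def resolvable(module_name, search_paths):
    """Return True if a top-level module is found on the given search paths."""
    spec = PathFinder.find_spec(module_name, [str(p) for p in search_paths])
    return spec is not None


# Roughly what the editor sees once ./hw_spiders is listed in extraPaths.
editor_view = resolvable("myitems", [package_dir])
# What the interpreter sees if only the project root is on sys.path.
runtime_view = resolvable("myitems", [root])

print(editor_view, runtime_view)
```

If that assumption about the layout holds, it would explain why autocomplete is happy with from myitems import CarItem while the crawl is not.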

I already checked here and read about relative imports, but I don't know how to map the proposed solutions to my current project structure as they're quite different. If my project structure is wrong or not recommended, I'd love to hear that too.

Another attempt

I changed from myitems import CarItem to from .myitems import CarItem. I immediately see a design time error:

Import ".myitems" could not be resolved Pylance (reportMissingImports)

But I ran the spider anyway.

When I run C:\scrapy\hw_spiders> scrapy crawl keizer -o allobjects.json, I get:

  File "C:\scrapy\hw_spiders\spiders\keizer.py", line 11, in <module>
    from .myitems import CarItem
ModuleNotFoundError: No module named 'hw_spiders.spiders.myitems'

When I run C:\scrapy> scrapy crawl keizer -o allobjects.json, I get:

  File "C:\scrapy\hw_spiders\spiders\keizer.py", line 11, in <module>
    from .myitems import CarItem
ModuleNotFoundError: No module named 'hw_spiders.spiders.myitems'
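
The module name in that last traceback can be reproduced with the standard library, which may help when reading it. This is only a sketch of how Python turns a relative import into an absolute name. The package name is taken from the error message itself, and nothing here touches Scrapy.

```python
from importlib.util import resolve_name

# The traceback shows keizer.py being loaded inside this package.
package = "hw_spiders.spiders"

# One leading dot: look next to keizer.py, inside spiders/.
one_dot = resolve_name(".myitems", package)
# Two leading dots: look one level up, inside hw_spiders/.
two_dots = resolve_name("..myitems", package)

print(one_dot)
print(two_dots)
```

So from .myitems only works if myitems.py is inside the spiders folder itself, and both runs fail the same way because the resolved name does not depend on the directory the command is started from.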

aaron
Adam

2 Answers

1

If the myitems module is in the same folder as the spiders directory, try the following:

from ..myitems import CarItem
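
A relative import like this only works when the spider file is loaded as part of its package. The sketch below illustrates that with a throwaway replica of the layout. The package name hw_spiders_demo is hypothetical, and placing myitems.py next to the spiders folder is an assumption, not something shown in the question.

```python
import importlib
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical replica; assumes myitems.py sits next to the spiders/ folder.
root = Path(tempfile.mkdtemp())
package_dir = root / "hw_spiders_demo"
spiders_dir = package_dir / "spiders"
spiders_dir.mkdir(parents=True)
(package_dir / "__init__.py").write_text("")
(spiders_dir / "__init__.py").write_text("")
(package_dir / "myitems.py").write_text("class CarItem:\n    pass\n")
(spiders_dir / "keizer.py").write_text("from ..myitems import CarItem\n")

# Imported as part of the package: the relative import resolves.
sys.path.insert(0, str(root))
keizer = importlib.import_module("hw_spiders_demo.spiders.keizer")
imported_ok = keizer.CarItem.__name__ == "CarItem"

# Run as a standalone file: there is no parent package, so the import fails.
result = subprocess.run(
    [sys.executable, str(spiders_dir / "keizer.py")],
    capture_output=True,
    text=True,
)
script_failed = (
    "attempted relative import with no known parent package" in result.stderr
)

print(imported_ok, script_failed)
```

The second case produces the same message as the runspider error quoted in the comments, which suggests the file was being loaded on its own there rather than through the project package.
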
Md. Fazlul Hoque
  • Hi Fazlul. When I add your answer and then debug it, I get the following error: `runspider: error: Unable to load 'c:\\scrapy\\hw_spiders\\spiders\\keizer.py': attempted relative import with no known parent package` Do you know how to fix that? – Adam Nov 09 '21 at 17:13
  • Hi @Adam, it's hard to guess without the full Scrapy project. When you work inside a project, you have to run the spider from the terminal with: scrapy crawl followed by the name of your spider. That is my current thinking. Thanks – Md. Fazlul Hoque Nov 09 '21 at 17:27
  • Hi Fazlul. When I run `scrapy crawl -o allobjects.json` I get the error `twisted.internet.error.ReactorNotRestartable`. When I debug and step through, I get that error after all spider code has been executed. My full project is here https://file.io/MTKJEckHgoal (password `Fazlul`), if you're willing and have time to take a look :) – Adam Nov 09 '21 at 18:40
  • Hi @Adam, the Scrapy project name (hw_spiders) and BOT_NAME = 'hw' in the settings file aren't the same. They must match, i.e. BOT_NAME = 'hw_spiders'. There may be more errors as well. I will be free to analyze it in depth in 2-3 days. Create a new project and debug there, and don't overwrite any Scrapy project files – Md. Fazlul Hoque Nov 09 '21 at 19:36
  • and read up online on the Scrapy project structure in particular – Md. Fazlul Hoque Nov 09 '21 at 19:46
  • The weird thing is that other spiders in that same project do function correctly. I can't share those due to privacy, however, so it must be something with the one I included in the project I shared. I massively appreciate any and all help! :) – Adam Nov 09 '21 at 21:08
  • Hi Fazlul, any chance you had some time to look at my project :) O:) – Adam Nov 14 '21 at 15:34
  • Hi Adam. I'm sorry, there is nothing I can do about it. It would be better to create a new post that focuses on it. Thanks – Md. Fazlul Hoque Nov 15 '21 at 10:19
  • Thanks. I just started a new thread with some more details, please check here if you like :) https://stackoverflow.com/questions/69975014/scrap-splash-crawler-reactornotrestartable – Adam Nov 15 '21 at 13:18
-1

Scrapy should be installed in the Python environment you are currently using.

Open an integrated Terminal and run

pip show scrapy

Check whether the Location it reports is the lib\site-packages folder of your current environment.

If not, please reinstall it by running pip install scrapy.
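
The same check can be made from inside Python, which confirms what the interpreter that runs the code actually sees. This is a small sketch using only the standard library. The helper name module_location is made up for illustration.

```python
import sys
from importlib.util import find_spec


def module_location(module_name):
    """Return the origin (usually a file path) of a module, or None if missing."""
    spec = find_spec(module_name)
    return None if spec is None else spec.origin


# Which interpreter and environment are in use right now.
print("interpreter:", sys.executable)
print("environment:", sys.prefix)
# Where scrapy would be imported from, or None if it is not installed here.
print("scrapy:", module_location("scrapy"))
```

If the scrapy line prints None, or a path outside the environment shown above it, the package is not installed in the environment that is running the code.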

Molly Wang-MSFT