
I used the PyInstaller command shown below to create a .exe file from my run_spider.py Python script, which starts a spider that rotates user agents with the help of the scrapy-user-agents library:

pyinstaller -y -F --hidden-import scrapy.spiderloader --hidden-import scrapy.statscollectors --hidden-import scrapy.logformatter --hidden-import scrapy.extensions --hidden-import scrapy.extensions.corestats --hidden-import scrapy.extensions.telnet --hidden-import scrapy.extensions.memusage --hidden-import scrapy.extensions.memdebug --hidden-import scrapy.extensions.closespider --hidden-import scrapy.extensions.feedexport --hidden-import scrapy.extensions.logstats --hidden-import scrapy.extensions.spiderstate --hidden-import scrapy.extensions.throttle --hidden-import scrapy.core.scheduler --hidden-import scrapy.squeues --hidden-import queuelib --hidden-import scrapy.core.downloader --hidden-import scrapy.downloadermiddlewares --hidden-import scrapy.downloadermiddlewares.robotstxt --hidden-import scrapy.downloadermiddlewares.httpauth --hidden-import scrapy.downloadermiddlewares.downloadtimeout --hidden-import scrapy.downloadermiddlewares.defaultheaders --hidden-import scrapy.downloadermiddlewares.useragent --hidden-import scrapy.downloadermiddlewares.retry --hidden-import scrapy.downloadermiddlewares.ajaxcrawl --hidden-import scrapy.downloadermiddlewares.redirect --hidden-import scrapy.downloadermiddlewares.httpcompression --hidden-import scrapy.downloadermiddlewares.cookies --hidden-import scrapy.downloadermiddlewares.httpproxy --hidden-import scrapy.downloadermiddlewares.stats --hidden-import scrapy.downloadermiddlewares.httpcache --hidden-import scrapy.spidermiddlewares --hidden-import scrapy.spidermiddlewares.httperror --hidden-import scrapy.spidermiddlewares.offsite --hidden-import scrapy.spidermiddlewares.referer --hidden-import scrapy.spidermiddlewares.urllength --hidden-import scrapy.spidermiddlewares.depth --hidden-import scrapy.pipelines --hidden-import scrapy.dupefilters --hidden-import scrapy.core.downloader.handlers.datauri --hidden-import scrapy.core.downloader.handlers.file --hidden-import scrapy.core.downloader.handlers.http --hidden-import scrapy.core.downloader.handlers.s3 --hidden-import scrapy.core.downloader.handlers.ftp --hidden-import scrapy.core.downloader.webclient --hidden-import scrapy.core.downloader.contextfactory --hidden-import makeexe.settings --hidden-import scrapy_user_agents --hidden-import scrapy_user_agents.middlewares "C:\Users\hp\PycharmProjects\scrrapyexe\makeexe\run_spider.py"

When I run run_spider.exe, it starts up but eventually throws the error shown below, which is related to the scrapy-user-agents library. Do you have any suggestions on how to resolve it?

C:\Users\hp\PycharmProjects\scrrapyexe\makeexe\dist>C:\Users\hp\PycharmProjects\scrrapyexe\makeexe\dist\run_spider.exe
2023-03-08 15:51:57 [scrapy.utils.log] INFO: Scrapy 2.8.0 started (bot: makeexe)
2023-03-08 15:51:57 [scrapy.utils.log] INFO: Versions: lxml 4.9.2.0, libxml2 2.9.12, cssselect 1.2.0, parsel 1.7.0, w3lib 2.1.1, Twisted 22.10.0, Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)], pyOpenSSL 23.0.0 (OpenSSL 3.0.8 7 Feb 2023), cryptography 39.0.2, Platform Windows-10-10.0.19045-SP0
2023-03-08 15:51:57 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'makeexe',
 'FEED_EXPORT_ENCODING': 'utf-8',
 'NEWSPIDER_MODULE': 'makeexe.spiders',
 'REQUEST_FINGERPRINTER_IMPLEMENTATION': '2.7',
 'ROBOTSTXT_OBEY': True,
 'SPIDER_MODULES': ['makeexe.spiders'],
 'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'}
2023-03-08 15:51:57 [asyncio] DEBUG: Using selector: SelectSelector
2023-03-08 15:51:57 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.asyncioreactor.AsyncioSelectorReactor
2023-03-08 15:51:57 [scrapy.utils.log] DEBUG: Using asyncio event loop: asyncio.windows_events._WindowsSelectorEventLoop
2023-03-08 15:51:57 [scrapy.extensions.telnet] INFO: Telnet Password: 3b47dca42680faf1
2023-03-08 15:51:57 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats']
Unhandled error in Deferred:
2023-03-08 15:51:58 [twisted] CRITICAL: Unhandled error in Deferred:

Traceback (most recent call last):
  File "scrapy\crawler.py", line 233, in crawl

  File "scrapy\crawler.py", line 237, in _crawl

  File "twisted\internet\defer.py", line 1947, in unwindGenerator

  File "twisted\internet\defer.py", line 1857, in _cancellableInlineCallbacks

--- <exception caught here> ---
  File "twisted\internet\defer.py", line 1697, in _inlineCallbacks

  File "scrapy\crawler.py", line 122, in crawl

  File "scrapy\crawler.py", line 136, in _create_engine

  File "scrapy\core\engine.py", line 78, in __init__

  File "scrapy\core\downloader\__init__.py", line 85, in __init__

  File "scrapy\middleware.py", line 68, in from_crawler

  File "scrapy\middleware.py", line 44, in from_settings

  File "scrapy\utils\misc.py", line 165, in create_instance

  File "scrapy_user_agents\middlewares.py", line 37, in from_crawler

  File "scrapy_user_agents\middlewares.py", line 30, in __init__

builtins.FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\hp\\AppData\\Local\\Temp\\_MEI88722\\scrapy_user_agents\\default_uas.txt'

2023-03-08 15:51:58 [twisted] CRITICAL:
Traceback (most recent call last):
  File "twisted\internet\defer.py", line 1697, in _inlineCallbacks
  File "scrapy\crawler.py", line 122, in crawl
  File "scrapy\crawler.py", line 136, in _create_engine
  File "scrapy\core\engine.py", line 78, in __init__
  File "scrapy\core\downloader\__init__.py", line 85, in __init__
  File "scrapy\middleware.py", line 68, in from_crawler
  File "scrapy\middleware.py", line 44, in from_settings
  File "scrapy\utils\misc.py", line 165, in create_instance
  File "scrapy_user_agents\middlewares.py", line 37, in from_crawler
  File "scrapy_user_agents\middlewares.py", line 30, in __init__
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\hp\\AppData\\Local\\Temp\\_MEI88722\\scrapy_user_agents\\default_uas.txt'
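
Based on the traceback, my guess is that the one-file exe unpacks itself into the temporary _MEIxxxx folder at run time, and scrapy-user-agents looks for its default_uas.txt data file there, but PyInstaller only collects importable modules by default, not package data files. Would adding the file explicitly with --add-data be the right fix? This is the kind of flag I had in mind (the site-packages path is only a placeholder and depends on the local installation; the --hidden-import flags are omitted here for brevity):

pyinstaller -y -F --add-data "<site-packages>\scrapy_user_agents\default_uas.txt;scrapy_user_agents" ... "C:\Users\hp\PycharmProjects\scrrapyexe\makeexe\run_spider.py"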

run_spider.py script

from makeexe.spiders.spider import QuoteSpider
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

if __name__ == '__main__':
    settings = get_project_settings()
    process = CrawlerProcess(settings)
    process.crawl(QuoteSpider)
    process.start()
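
To check whether the file actually ends up inside the bundle, I was planning to add a quick test to run_spider.py before the crawl starts. This is only a debugging sketch; sys._MEIPASS is the temporary directory a one-file PyInstaller build extracts into:

import os
import sys

# Debugging sketch: when running as a frozen exe, PyInstaller unpacks the
# bundle into sys._MEIPASS; report whether default_uas.txt made it in.
if getattr(sys, 'frozen', False):
    uas_path = os.path.join(sys._MEIPASS, 'scrapy_user_agents', 'default_uas.txt')
    print(uas_path, 'exists:', os.path.exists(uas_path))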

spider.py script

import scrapy


class QuoteSpider(scrapy.Spider):
    name = "book"
    start_urls = ["https://www.amazon.com/gp/new-releases/books/?ie=UTF8&ref_=sv_b_2"]

    def parse(self, response):
        book = response.css('.a-link-normal span div::text').get()
        price = response.css('._cDEzb_p13n-sc-price_3mJ9Z::text').get()
        print(f'Book name: {book}')
        print(f'Price: {price}')
        yield {
            'Book name': book,
            'Price': price
        }
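
Alternatively, instead of hard-coding an --add-data path, would a small custom PyInstaller hook be cleaner? This is just my own sketch (not a hook shipped with either library), saved as hook-scrapy_user_agents.py and passed to PyInstaller with --additional-hooks-dir:

# hook-scrapy_user_agents.py -- my own sketch, not an official hook; it asks
# PyInstaller to collect the package's non-Python data files (which should
# include default_uas.txt) into the bundle.
from PyInstaller.utils.hooks import collect_data_files

datas = collect_data_files('scrapy_user_agents')

I would then build with pyinstaller -y -F --additional-hooks-dir . ... run_spider.py, but I have not verified this, so any pointers are appreciated.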