
I have been using the method described in this Stack Overflow answer (https://stackoverflow.com/a/43661172/5037146) to run Scrapy from a script with CrawlerRunner, so that the crawl process can be restarted.

However, I don't get any console logs when running the process through CrawlerRunner, whereas when I use CrawlerProcess, it outputs the status and progress.

Code is available online: https://colab.research.google.com/drive/14hKTjvWWrP--h_yRqUrtxy6aa4jG18nJ

Aerodynamic

2 Answers


With CrawlerRunner you need to set up logging manually, which you can do with configure_logging(). See https://docs.scrapy.org/en/latest/topics/practices.html#run-scrapy-from-a-script

Gallaecio
  • Thanks, it works! I also found out that using CrawlerProcess instead of CrawlerRunner automatically yields logs. – Aerodynamic Sep 04 '19 at 08:27
  • I found that you need to configure `LOG_FILE` both inside configure_logging and in the settings in order for the `CrawlerRunner` to log properly. – Rakka Alhazimi Dec 08 '22 at 21:33
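The tip in that comment can be sketched as a single settings dict handed to both calls. The file name below is hypothetical, and the two Scrapy calls are shown as comments since they assume a working Scrapy install:

```python
# Shared settings dict; "crawl.log" is a hypothetical file name.
settings = {
    "LOG_FILE": "crawl.log",
    "LOG_FORMAT": "%(levelname)s: %(message)s",
}

# Pass the same dict to both (requires Scrapy, so shown as comments here):
#   scrapy.utils.log.configure_logging(settings)   # installs root handlers
#   runner = scrapy.crawler.CrawlerRunner(settings)  # forwards settings to crawlers
```

Using one dict keeps the handler installed by configure_logging consistent with the LOG_* settings each crawler receives.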

When you use CrawlerRunner you have to configure logging manually. You can do this with the scrapy.utils.log.configure_logging function.

For example:

import scrapy.crawler
import scrapy.utils.log
from twisted.internet import reactor

from my_spider import MySpider

# CrawlerRunner does not configure logging for you, so do it first.
scrapy.utils.log.configure_logging(
    {
        "LOG_FORMAT": "%(levelname)s: %(message)s",
    },
)

runner = scrapy.crawler.CrawlerRunner()
d = runner.crawl(MySpider)
d.addBoth(lambda _: reactor.stop())  # stop the reactor when the crawl ends
reactor.run()  # the script blocks here until the crawl finishes
Alon Barad