
I have been using the method described in this Stack Overflow answer (https://stackoverflow.com/a/43661172/5037146) to run Scrapy from a script with CrawlerRunner, so that the crawl process can be restarted.

However, I don't get any console logs when running the process through CrawlerRunner, whereas when I use CrawlerProcess, it outputs the status and progress.

Code is available online: https://colab.research.google.com/drive/14hKTjvWWrP--h_yRqUrtxy6aa4jG18nJ

Aerodynamic

2 Answers


With CrawlerRunner you need to set up logging manually, which you can do with configure_logging(). See https://docs.scrapy.org/en/latest/topics/practices.html#run-scrapy-from-a-script

Gallaecio
  • Thanks, it works! I also found out that using CrawlerProcess instead of CrawlerRunner automatically yields logs. – Aerodynamic Sep 04 '19 at 08:27
  • I found that you need to configure `LOG_FILE` both inside configure_logging and in the settings in order for the `CrawlerRunner` to log properly. – Rakka Alhazimi Dec 08 '22 at 21:33
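The tip in that comment can be sketched as a single settings dict handed to both calls. The file name below is hypothetical, and the two Scrapy calls are shown as comments since they assume a working Scrapy install:

```python
# Shared settings dict; "crawl.log" is a hypothetical file name.
settings = {
    "LOG_FILE": "crawl.log",
    "LOG_FORMAT": "%(levelname)s: %(message)s",
}

# Pass the same dict to both (requires Scrapy, so shown as comments here):
#   scrapy.utils.log.configure_logging(settings)   # installs root handlers
#   runner = scrapy.crawler.CrawlerRunner(settings)  # forwards settings to crawlers
```

Using one dict keeps the handler installed by configure_logging consistent with the LOG_* settings each crawler receives.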

When you use CrawlerRunner you have to configure logging manually. You can do this with the scrapy.utils.log.configure_logging function.

For example:

import scrapy.crawler
import scrapy.utils.log
from twisted.internet import reactor

from my_spider import MySpider

# CrawlerRunner does not configure logging for you, so do it first.
scrapy.utils.log.configure_logging(
    {
        "LOG_FORMAT": "%(levelname)s: %(message)s",
    },
)

runner = scrapy.crawler.CrawlerRunner()
d = runner.crawl(MySpider)
d.addBoth(lambda _: reactor.stop())  # stop the reactor when the crawl ends
reactor.run()  # the script blocks here until the crawl finishes
Alon Barad