
Apparently, I shouldn't be using ScrapyFileLogObserver anymore (http://doc.scrapy.org/en/1.0/topics/logging.html). But I still want to be able to save my log messages to a file, and I still want all the standard Scrapy console information to be saved to the file too.

From reading up on how to use the Python logging module, this is the code I have tried:

import logging

from scrapy.linkextractors.sgml import SgmlLinkExtractor
from scrapy.spiders import CrawlSpider, Rule

from blah.items import BlahItem


class BlahSpider(CrawlSpider):
    name = 'blah'
    allowed_domains = ['blah.com']
    start_urls = ['https://www.blah.com/blahblahblah']

    rules = (
        Rule(SgmlLinkExtractor(allow=r'whatever'), callback='parse_item', follow=True),
    )

    def __init__(self):
        CrawlSpider.__init__(self)
        # attach everything to the root logger
        self.logger = logging.getLogger()
        self.logger.setLevel(logging.DEBUG)
        # file output
        logging.basicConfig(filename='debug_log.txt', filemode='w', format='%(asctime)s %(levelname)s: %(message)s',
                            level=logging.DEBUG)
        # console output
        console = logging.StreamHandler()
        console.setLevel(logging.DEBUG)
        simple_format = logging.Formatter('%(levelname)s: %(message)s')
        console.setFormatter(simple_format)
        self.logger.addHandler(console)
        self.logger.info("Something")

    def parse_item(self, response):
        i = BlahItem()
        return i

It runs fine, and it saves "Something" to the file. However, none of the output that I see in the command prompt window (the output that ScrapyFileLogObserver used to save to the file) is saved now.

I thought that my "console" handler using logging.StreamHandler() was supposed to deal with that, but that was just something I had read, and I don't really understand how it works.

Can anyone point out what I am missing or where I have gone wrong?

Thank you.

Joe_AK

2 Answers


I think the problem is that you've used both basicConfig and addHandler.
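In particular, basicConfig only configures the root logger if it has no handlers yet, and by the time the spider's __init__ runs, Scrapy 1.0 has normally already installed its own root handler, so the basicConfig call can silently do nothing. A minimal standalone sketch of that documented behavior (not Scrapy-specific):

import logging

root = logging.getLogger()
root.addHandler(logging.StreamHandler())  # simulate a handler installed earlier, e.g. by Scrapy

# basicConfig is a no-op when the root logger already has handlers,
# so the FileHandler it would normally create is never attached
logging.basicConfig(filename='debug_log.txt', level=logging.DEBUG)

print(root.handlers)  # still only the StreamHandler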

Configure two handlers separately:

# inside BlahSpider.__init__
self.logger = logging.getLogger()
self.logger.setLevel(logging.DEBUG)

logFormatter = logging.Formatter('%(asctime)s %(levelname)s: %(message)s')

# file handler
fileHandler = logging.FileHandler("debug_log.txt")
fileHandler.setLevel(logging.DEBUG)
fileHandler.setFormatter(logFormatter)
self.logger.addHandler(fileHandler)

# console handler
consoleHandler = logging.StreamHandler()
consoleHandler.setLevel(logging.DEBUG)
consoleHandler.setFormatter(logFormatter)
self.logger.addHandler(consoleHandler)
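
Scrapy 1.0 emits its messages through the standard logging module, and loggers propagate to the root logger by default, so with both handlers attached to the root logger everything you see on the console should also land in debug_log.txt. A quick standalone check (assuming the setup above has already run):

logging.getLogger('scrapy').info('simulated Scrapy message')  # reaches both handlers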

alecxe

You can send all of Scrapy's log output to a file by first disabling the root handler via scrapy.utils.log.configure_logging and then adding your own log handler.

In the settings.py file of your Scrapy project, add the following code:

import logging
from logging.handlers import RotatingFileHandler

from scrapy.utils.log import configure_logging

# Disable default Scrapy log settings.
LOG_ENABLED = False
configure_logging(install_root_handler=False)

# Define your logging settings.
log_file = '/tmp/logs/CRAWLER_logs.log'

root_logger = logging.getLogger()
root_logger.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
rotating_file_log = RotatingFileHandler(log_file, maxBytes=10485760, backupCount=1)
rotating_file_log.setLevel(logging.DEBUG)
rotating_file_log.setFormatter(formatter)
root_logger.addHandler(rotating_file_log)
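
With this in place, both Scrapy's own messages and anything a spider logs should end up in the rotating file, since spider loggers propagate to the root logger by default. A minimal usage sketch (the spider name and URL here are illustrative):

import scrapy

class ExampleSpider(scrapy.Spider):
    name = 'example'
    start_urls = ['https://example.com']

    def parse(self, response):
        # propagates to the root logger and into /tmp/logs/CRAWLER_logs.log
        self.logger.info('Parsed %s', response.url)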

You can also customize the log level (e.g. change DEBUG to INFO) and the formatter as required. Hope this helps!
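
For example, to make the file log less verbose, change one line of the snippet above:

rotating_file_log.setLevel(logging.INFO)  # keep INFO and above, drop DEBUG noise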

hemraj