
I am trying to make Scrapy output colorized logs. I am not very familiar with Python logging, but my understanding is that I must write my own Formatter and make Scrapy use it. I succeeded in writing a Formatter that colorizes the output using Clint.

My problem is that I can't make it work correctly within Scrapy. I expected the logger object in my spider to have a handler, so that I could switch that handler's formatter. When I look at what is inside spider.logger.logger, I see that handlers is an empty list. I tried to add my formatter through a new stream handler with:

crawler.spider.logger.logger.addHandler(sh)

where sh is a handler using my color formatter.

The effect is that Scrapy outputs each message twice. The first copy is colorized but doesn't have Scrapy's formatting; the second has Scrapy's formatting but no colors.

How can I make Scrapy output colorized logs while keeping the format that can be set in settings.py?

Thanks

1 Answer


If you only want to colorize the fields of each log record, you can customize LOG_FORMAT in settings.py with ANSI escape codes.

Example:

LOG_FORMAT = '\x1b[0;0;34m%(asctime)s\x1b[0;0m \x1b[0;0;36m[%(name)s]\x1b[0;0m \x1b[0;0;31m%(levelname)s\x1b[0;0m: %(message)s'
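To preview what such a format string produces before putting it in settings.py, it can be fed to a plain logging.Formatter. This is only a minimal sketch using the standard library; the constant names and the sample record are illustrative assumptions, and the colour codes are simplified versions of the ones above.

```python
import logging

# Illustrative ANSI colour codes (simplified forms of the ones in LOG_FORMAT above).
BLUE, CYAN, RED, RESET = "\x1b[34m", "\x1b[36m", "\x1b[31m", "\x1b[0m"

LOG_FORMAT = (
    f"{BLUE}%(asctime)s{RESET} {CYAN}[%(name)s]{RESET} "
    f"{RED}%(levelname)s{RESET}: %(message)s"
)

formatter = logging.Formatter(LOG_FORMAT)

# Build a sample record by hand; the logger name and message are made up.
record = logging.LogRecord(
    name="demo", level=logging.INFO, pathname="example.py",
    lineno=1, msg="Spider opened", args=None, exc_info=None,
)

line = formatter.format(record)
print(line)  # shows the colours when the terminal understands ANSI codes
```

Every field gets a fixed colour here, regardless of the log level, which is the limitation the next approach addresses.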

If you also want to give different log levels different colors, you can override Scrapy's private function scrapy.utils.log._get_handler.

Put this near the top of your settings.py:

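import copy

import scrapy.utils.log

# Keep a reference to Scrapy's original handler factory.
_get_handler = copy.copy(scrapy.utils.log._get_handler)


def _get_handler_custom(*args, **kwargs):
    # Build the handler as Scrapy normally would, then swap in your formatter.
    handler = _get_handler(*args, **kwargs)
    handler.setFormatter(your_custom_formatter)  # placeholder: any logging.Formatter instance
    return handler

scrapy.utils.log._get_handler = _get_handler_custom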

This works by wrapping the original _get_handler: the wrapper calls it, replaces the formatter on the handler it returns, and is then assigned back to scrapy.utils.log._get_handler. It is a hack that relies on a private function, so it may not be best practice, but it works.
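The wrapping pattern can be tried without Scrapy installed. In the sketch below, fake_log_module is a hypothetical stand-in for scrapy.utils.log, and its _get_handler is assumed to simply return a StreamHandler; only the wrap-and-reassign step mirrors the snippet above.

```python
import logging
import types

# Hypothetical stand-in for scrapy.utils.log, so the sketch runs without Scrapy.
fake_log_module = types.SimpleNamespace(
    _get_handler=lambda settings: logging.StreamHandler()
)

# Any logging.Formatter works here; this one colours the level name green.
custom_formatter = logging.Formatter("\x1b[32m%(levelname)s\x1b[0m %(message)s")

# Remember the original factory before replacing it.
original_get_handler = fake_log_module._get_handler


def get_handler_custom(*args, **kwargs):
    # Let the original build the handler, then swap in the custom formatter.
    handler = original_get_handler(*args, **kwargs)
    handler.setFormatter(custom_formatter)
    return handler


# Reattach the wrapper so later callers get the patched handler.
fake_log_module._get_handler = get_handler_custom

patched_handler = fake_log_module._get_handler({})
```

Because the wrapper delegates to the original, whatever else the original sets up on the handler is kept; only the formatter changes.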

A cleaner way to achieve this is to subclass logging.StreamHandler. There are several discussions on Stack Overflow that can point you in the right direction.
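As a rough idea of what such a subclass could look like, here is a minimal standard-library sketch. ColorStreamHandler and LEVEL_COLORS are hypothetical names chosen for illustration, and the sketch does not show how to register the handler with Scrapy.

```python
import io
import logging

# Hypothetical mapping from log level to an ANSI colour code.
LEVEL_COLORS = {
    logging.DEBUG: "\x1b[34m",      # blue
    logging.INFO: "\x1b[36m",       # cyan
    logging.WARNING: "\x1b[33m",    # yellow
    logging.ERROR: "\x1b[31m",      # red
    logging.CRITICAL: "\x1b[1;31m", # bold red
}
RESET = "\x1b[0m"


class ColorStreamHandler(logging.StreamHandler):
    """StreamHandler that wraps each formatted line in a per-level colour."""

    def format(self, record):
        text = super().format(record)  # keeps whatever formatter is attached
        color = LEVEL_COLORS.get(record.levelno, "")
        return f"{color}{text}{RESET}" if color else text


# Write to an in-memory stream so the result can be inspected.
stream = io.StringIO()
handler = ColorStreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

demo_logger = logging.Logger("demo")  # standalone logger, not registered globally
demo_logger.addHandler(handler)
demo_logger.warning("slow response")

output = stream.getvalue()
```

Since the colouring happens in the handler, the attached formatter (and therefore a format string such as LOG_FORMAT) is left untouched.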

Here is the full working code I use in my projects (it relies on the third-party package colorlog).

settings.py

import copy

from colorlog import ColoredFormatter
import scrapy.utils.log

color_formatter = ColoredFormatter(
    (
        '%(log_color)s%(levelname)-5s%(reset)s '
        '%(yellow)s[%(asctime)s]%(reset)s'
        '%(white)s %(name)s %(funcName)s %(bold_purple)s:%(lineno)d%(reset)s '
        '%(log_color)s%(message)s%(reset)s'
    ),
    datefmt='%y-%m-%d %H:%M:%S',
    log_colors={
        'DEBUG': 'blue',
        'INFO': 'bold_cyan',
        'WARNING': 'red',
        'ERROR': 'bg_bold_red',
        'CRITICAL': 'red,bg_white',
    }
)

_get_handler = copy.copy(scrapy.utils.log._get_handler)

def _get_handler_custom(*args, **kwargs):
    handler = _get_handler(*args, **kwargs)
    handler.setFormatter(color_formatter)
    return handler

scrapy.utils.log._get_handler = _get_handler_custom
amigcamel
  • I know we're not supposed to say thanks, but thanks! Great answer :D – Sebastian Thomas May 01 '20 at 16:41
  • This is great, I just coded that into my project and honestly it seems like it should ship with Scrapy by default, perhaps a pull request / scrapy contribution on github is in order? – Alex Thompson Apr 26 '21 at 04:47
  • If you use Windows, here is how to enable ANSI colors in the cmd prompt console: https://superuser.com/questions/413073/windows-console-with-ansi-colors-handling – Dorian Grv Jul 02 '22 at 20:03
  • You can also use the same codes in print, for example next to a yield: `print("\x1b[1;0;32m---Found something---")` More info about ANSI codes: https://tforgione.fr/posts/ansi-escape-codes/ – Dorian Grv Jul 03 '22 at 12:54