I wrote a program that will run for a whole day, and I would like to capture the entire logging output in a manageable manner. Specifically, I would like the program to receive a maximum file size as a configuration parameter, and to have a logging handler in place that captures all logging but ensures that, after execution halts, the output is NOT one single log file, but a collection of sequentially numbered log files, each with a size under the given input parameter.

Could this be done by specifying a logging handler? I know that "Handler objects are responsible for dispatching the appropriate log messages (based on the log messages’ severity) to the handler’s specified destination".
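
For instance, attaching a plain FileHandler to the root logger looks like the sketch below; what I am after is a handler with this same shape that additionally caps the file size (the file name and format here are placeholders):

import logging

# A plain FileHandler writes everything to one file; I need a variant that
# switches to output.1.log, output.2.log, ... once a size cap is reached.
handler = logging.FileHandler("output.0.log", encoding="utf-8")
handler.setFormatter(logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))
root = logging.getLogger()
root.setLevel(logging.DEBUG)
root.addHandler(handler)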

Current research:

  1. The question "How to split a log file into several csv files with python" is about already having one big file and then splitting it. I would like to generate the collection of files directly.
  2. I have checked https://docs.python.org/3/howto/logging.html but I do not see a specific example that allows me to control the maximum size of each file in a set of sequentially numbered output files.
  3. I have checked a few handlers such as https://docs.python.org/3/library/logging.handlers.html#logging.handlers.RotatingFileHandler . In particular, this one allows controlling the maximum file size per rotation, but it does NOT behave as I need: on rollover it renames the existing files and deletes the oldest, instead of keeping every file (see the snippet after this list).
  4. I have checked https://docs.python.org/3/howto/logging-cookbook.html#using-file-rotation which looks very close to how I would like the output to be, but the file rotation is not what I require.
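
For reference, this is roughly how RotatingFileHandler is configured (a minimal sketch; the file name and counts are placeholders). On rollover it renames app.log to app.log.1, shifts the older backups up, and deletes the oldest once backupCount is exceeded, which is why it does not fit:

import logging
from logging.handlers import RotatingFileHandler

# Keeps at most backupCount renamed backups; older sections are lost.
handler = RotatingFileHandler("app.log", maxBytes=26214400, backupCount=5)
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s - %(levelname)s - %(message)s',
                    handlers=[handler])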

I believe that deriving from the FileHandler class (https://docs.python.org/3/library/logging.handlers.html#filehandler) to allow for this custom behavior would be a good idea, since my program is comprised of many modules, and I am interested in capturing the logging of all of them.

Pseudo code of desired behavior: Hopefully, this pseudo code helps me convey the kind of behavior I have in mind.

file_name = "output.{id}.log"
max_log_file_size_byte =  26214400  # 25 MB.
file_number = 0
logging.basicConfig(filename=file_name.format(id=str(file_number)),
                    encoding='utf-8',
                    format='%(asctime)s - %(levelname)s - %(message)s',
                    level=logging.DEBUG)
while True:
    Take file 0 size in bytes.
    if File 0 size > max_log_file_size_byte:
        file_number = file_number + 1
        logging.basicConfig(filename=file_name.format(id=str(file_number)),
                            encoding='utf-8',
                            format='%(asctime)s - %(levelname)s - %(message)s',
                            level=logging.DEBUG)
    App.Function1WhichNeedsToLog()
    App.Function2WhichNeedsToLog()

1 Answer

I created a derived class from logging.FileHandler as follows:

import logging
import os


class SectionedFileHandler(logging.FileHandler):
    """A handler that splits the log output into sections of approximately a given size.
    """
    def __init__(self, section_name_pattern, mode='a', encoding=None, delay=False, errors=None,
                 max_bytes_ref=1048576):
        """Prepares the section file name pattern and instantiates the base class.
        """
        self._current_section = 0
        self._section_name_pattern = section_name_pattern
        self._section_file_name = self._section_name_pattern.format(ch=self._current_section)
        logging.FileHandler.__init__(self, filename=self._section_file_name, mode=mode, encoding=encoding, delay=delay, errors=errors)
        self._max_bytes_ref = max_bytes_ref

    def emit(self, record):
        """Overridden emission of a record.

        Checks the size of the current section before every emit. When it exceeds the given
        reference, it switches the inherited self.stream attribute via self.setStream(), so
        logging continues in the next section.
        """
        try:
            file_stats = os.stat(self.baseFilename)
        except FileNotFoundError:
            file_stats = None
        if file_stats and file_stats.st_size > self._max_bytes_ref:
            self._current_section = self._current_section + 1
            self._section_file_name = self._section_name_pattern.format(ch=self._current_section)
            # Keep the absolute path, otherwise derived classes which use this
            # may come a cropper when the current directory changes:
            self.baseFilename = os.path.abspath(self._section_file_name)
            # self._open() honours the handler's mode, encoding and errors;
            # setStream() returns the previous stream, which must be closed
            # explicitly to avoid leaking file handles.
            old_stream = self.setStream(self._open())
            if old_stream:
                old_stream.close()
        logging.FileHandler.emit(self, record)

The following minimal working example behaves as required:

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s [%(levelname)s] %(message)s",
    handlers=[SectionedFileHandler("debug.{ch}.log", max_bytes_ref=100048), logging.StreamHandler()]
)

number = 0
while True:
    number += 1
    logging.debug(str(number))
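
To sanity-check the output after stopping the program, one can list the generated sections and their sizes (a quick hypothetical check, assuming the debug.{ch}.log pattern used above):

import glob
import os

# Every section should be at approximately max_bytes_ref, exceeding it by
# at most one record, since the size check happens before each emit.
for path in sorted(glob.glob("debug.*.log")):
    print(path, os.path.getsize(path), "bytes")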

The following links were very useful:

  1. https://github.com/python/cpython/blob/main/Lib/logging/handlers.py
  2. https://www.digitalocean.com/community/tutorials/how-to-get-file-size-in-python
  3. "logger configuration to log to file and print to stdout"
  4. "How to change filehandle with Python logging on the fly with different classes and imports"