
I have a Driver.py script that starts multiple threads based on the given inputs. Each thread basically runs a module of a selected object. So Driver.py may call thread_1.run(), thread_2.run(), thread_3.run(), and then continue its own process.

Driver.py logs its output into a main.log file, whereas I want each thread to log its output into its own unique file. Driver.py and the threads also use common modules that are defined in different files, and those modules log information as well.

I call setup_load("main.log") first in the driver, and afterwards setup_load(f"thread_{jobid}.log") in each thread. I realized that once a thread is started, Driver.py writes into the thread's log file. I could use a different logger inside the thread, but when the thread calls other modules, those common modules use import logging, so they write to the filename configured on the root logger.


=> Is it possible to log messages from different threads to different files? I found multiple answers on SO (for example), but none of them covers the case where a module defined in a different file is called: how would it find out which logger it should use?

=> So the problem I am facing is that every thread uses the same underlying logger, so when I change the file path of logging.basicConfig in one thread, it affects all threads and the driver, since they are all using it.

=> How would functions from different modules, called from a thread or from the driver, know which logger to choose?


The comment section on "How to change filehandle with Python logging on the fly with different classes and imports" has a discussion and a recommended solution.

@Martijn Pieters:

next option: create per-thread handlers, give each of them a filter that filters on the logrecord thread attribute. Attach a filter to any other handlers that returns False for logrecords with thread set

alper

1 Answer


Yes, you can direct log entries from different threads to different files. You'll need to:

  • Create a log filter that can filter records by their LogRecord.thread or LogRecord.threadName attribute
  • Create a filter that does not accept records with specific or all thread ids.
  • Create a log handler per thread, giving it a log filter that only accepts log records for its specific thread.
  • Attach the filter that ignores log records for your threads to any other handlers.

When filtering, you have the choice between filtering on thread id (the value returned by threading.get_ident()) or thread name (whatever you passed in as the name argument to the Thread() object). If you have a pattern for your thread names, this is where you'd use it.
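As a quick check of what those two record attributes contain, the sketch below captures a record emitted from a named worker thread and compares it with threading.get_ident(). The CaptureHandler class, the logger name "attr_demo", and the thread name "job-7" are made up for illustration and are not part of the question's code.

```python
import logging
import threading


class CaptureHandler(logging.Handler):
    """Hypothetical helper: keep every record in a list for inspection."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


capture = CaptureHandler()
demo_logger = logging.getLogger("attr_demo")  # illustrative logger name
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo away from the root logger
demo_logger.addHandler(capture)

seen_ident = []


def work():
    # Remember the id of the thread that does the logging.
    seen_ident.append(threading.get_ident())
    demo_logger.info("hello from the worker")


worker = threading.Thread(target=work, name="job-7")
worker.start()
worker.join()

record = capture.records[0]
# record.threadName holds the Thread name, record.thread the thread id.
print(record.threadName, record.thread == seen_ident[0])
```

Thread names are usually easier to work with than ids, since you choose them yourself when creating the Thread() object.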

Creating a custom filter is easy enough:

import threading
from logging import Filter

class ThreadFilter(Filter):
    """Only accept log records from a specific thread or thread name"""

    def __init__(self, threadid=None, threadname=None):
        if threadid is None and threadname is None:
            raise ValueError("Must set a threadid and/or threadname to filter on")
        self._threadid = threadid
        self._threadname = threadname

    def filter(self, record):
        # Reject the record if a thread id was given and it does not match
        if self._threadid is not None and record.thread != self._threadid:
            return False
        # Reject the record if a thread name was given and it does not match
        if self._threadname is not None and record.threadName != self._threadname:
            return False
        return True

class IgnoreThreadsFilter(Filter):
    """Only accepts log records that originated from the main thread"""

    def __init__(self):
        self._main_thread_id = threading.main_thread().ident

    def filter(self, record):
        return record.thread == self._main_thread_id

If you want to match a specific pattern, adjust the code accordingly.
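To show how the pieces could fit together, here is a minimal end-to-end sketch. It assumes threads named thread_{jobid}, writes into a temporary directory, and uses a shared_helper() function as a stand-in for a common module that only does import logging; all of those names are illustrative. The filter classes are trimmed-down versions of the ones above so that the block runs on its own. Note that the filters are attached to the handlers, not to the loggers.

```python
import logging
import os
import tempfile
import threading


class ThreadFilter(logging.Filter):
    """Accept only records emitted by the thread with the given name."""

    def __init__(self, threadname):
        super().__init__()
        self._threadname = threadname

    def filter(self, record):
        return record.threadName == self._threadname


class IgnoreThreadsFilter(logging.Filter):
    """Accept only records emitted by the main thread."""

    def __init__(self):
        super().__init__()
        self._main_thread_id = threading.main_thread().ident

    def filter(self, record):
        return record.thread == self._main_thread_id


def shared_helper():
    # Stand-in for a shared module: it logs through the root logger
    # and knows nothing about threads or files.
    logging.info("hello from %s", threading.current_thread().name)


log_dir = tempfile.mkdtemp()  # illustrative location for the log files
root = logging.getLogger()
root.setLevel(logging.INFO)

# The driver's handler only lets main-thread records through.
main_handler = logging.FileHandler(os.path.join(log_dir, "main.log"))
main_handler.addFilter(IgnoreThreadsFilter())
root.addHandler(main_handler)

threads, handlers = [], []
for jobid in (1, 2):
    name = f"thread_{jobid}"
    # One file handler per thread, filtered on that thread's name.
    handler = logging.FileHandler(os.path.join(log_dir, f"{name}.log"))
    handler.addFilter(ThreadFilter(name))
    root.addHandler(handler)
    handlers.append(handler)
    threads.append(threading.Thread(target=shared_helper, name=name))

shared_helper()  # called from the driver
for t in threads:
    t.start()
for t in threads:
    t.join()

# Detach and close everything once all threads are done.
for h in [main_handler, *handlers]:
    root.removeHandler(h)
    h.close()


def read(filename):
    with open(os.path.join(log_dir, filename)) as fh:
        return fh.read()


print(read("thread_1.log"), read("thread_2.log"), sep="")
```

The shared function needs no changes: every record reaches every handler on the root logger, and each handler's filter decides whether that record belongs in its file.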

Martijn Pieters
  • I am confused: if I have a module that is called by the main thread and/or a specific thread, like `def hello_world(): logging.info("hello")`, do I need to make any changes to that shared module? – alper May 19 '20 at 11:14
  • @alper then you don’t have thread specific logging perhaps. Then you have module-specific logging. Take a look at the [LogRecord attributes](https://docs.python.org/library/logging.html#logrecord-attributes) and see what combination of attributes you want to accept for each handler. Then adjust your filter accordingly. – Martijn Pieters May 19 '20 at 11:23
  • *I can filter logs, but I cannot redirect them into different files on the fly.* // I think I have `module-specific` logging, as I just do import logging after calling `basicConfig` and then call `logging.info(...)` in the modules that are used by the main thread and the threads it starts. At this point, I am not sure how to let a shared module decide which logging filter (`ThreadFilter` or `IgnoreThreadsFilter`) it should use. I tried to put together a simple example: https://gist.github.com/avatar-lavventura/6a1ef52e552a8854d11df2a1e77f3759 – alper May 19 '20 at 14:12
  • @alper: logging configuration is complex. A handler is given log records from the logger object it is attached to, and everything downstream that propagates and passes the effective level. After that, you have to use filters. The moment you have *multiple loggers whose only common parent is the root*, you have to use filtering, basically. – Martijn Pieters May 19 '20 at 14:44
  • @alper: you attached filters to your *log objects* and not to your *handlers*. You also didn't give your thread a thread name, or configured your `ThreadFilter()` object to filter on a specific thread id or thread name. The code in my answer will not filter on anything if neither a thread name nor thread id has been set. I'll update it to disallow that. – Martijn Pieters May 19 '20 at 14:46
  • Sorry for so many questions. I am able to filter logs, but I was not able to write the main thread's logs to a different file than the threads'. `MainThread` and the `child threads` should use different file handlers from each other. I think that during each `filtering`, if possible, I need to change the logger's file handler. – alper May 19 '20 at 15:04
  • @alper: I just corrected a typo in the `IgnoreThreadsFilter` implementation; the `!=` should have been `==`. I'll create a fork of your gist with corrections. – Martijn Pieters May 19 '20 at 15:11
  • @alper: with the corrected `IgnoreThreadsFilter` from my answer, the code in [my version of your gist](https://gist.github.com/mjpieters/2c73b6f6ed31bc426a8786c0339c18bb) works as designed. The main reason your code didn't work is because you were adding the filters to the logger objects, not to the *handlers*. – Martijn Pieters May 19 '20 at 15:24
  • Thank you, I was able to make it work. I have a small question: if I have hundreds of threads, would it be a problem/overload to have hundreds of open `FileHandler`s? Do they close themselves when the thread finishes, or should I close them manually? – alper May 19 '20 at 18:23
  • @alper: the handlers are not aware of the thread being closed, so yes, you'll have to close and remove those yourself. How much of a problem not closing them is depends on what limits your OS sets on processes. If you run into a [*too many open files* IOError](https://stackoverflow.com/questions/18280612/ioerror-errno-24-too-many-open-files) then you do have too many open files and either should handle closing better or configure your OS to raise the limit. – Martijn Pieters May 19 '20 at 19:44
  • I am not sure, but in Python 3 the record has an attribute named `threadName` instead of `thread_name`, which contains `Thread-1` when I add `super().__init__(thread_name)` to `__init__()`. Could I check only the `record.thread != self._thread_id:` condition? @Martijn Pieters♦ – alper May 23 '20 at 13:55
  • @alper: record has an attribute `threadName` in any Python version; my answer uses that exact attribute: `if self._threadname is not None and record.threadName != self._threadname:`. The thread filter gives you the choice of filtering on either the thread name, or the thread id. Or both, if that suits your needs better. I'm not sure where this became an issue? – Martijn Pieters May 23 '20 at 14:04
  • Sorry, I was working on an old version where it remained as `record.thread_name`; I updated it to `record.threadName` and it works without an issue. // As a final step, I am also trying to close the handlers when their threads have completed, since there could be hundreds of threads and I am not sure whether my OS would handle them. – alper May 23 '20 at 14:11
  • Also, `pylint` gives the following warning: `W0231: __init__ method from base class 'Filter' is not called (super-init-not-called)`, but I think it is not that important. – alper May 23 '20 at 14:21
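On the point raised in the comments about closing per-thread handlers yourself, one possible shape for that cleanup is sketched below. The helper names open_thread_handler() and close_thread_handler(), the thread_{jobid} naming, and the temporary directory are assumptions for illustration. In this sketch the driver joins every worker first and only then removes the handlers, so no handler is detached while another thread is still logging.

```python
import logging
import os
import tempfile
import threading


class ThreadFilter(logging.Filter):
    """Trimmed-down filter: accept only records from one thread name."""

    def __init__(self, threadname):
        super().__init__()
        self._threadname = threadname

    def filter(self, record):
        return record.threadName == self._threadname


def open_thread_handler(log_dir, thread_name):
    """Hypothetical helper: attach a per-thread file handler to the root logger."""
    handler = logging.FileHandler(os.path.join(log_dir, f"{thread_name}.log"))
    handler.addFilter(ThreadFilter(thread_name))
    logging.getLogger().addHandler(handler)
    return handler


def close_thread_handler(handler):
    """Hypothetical helper: detach the handler and release its file descriptor."""
    logging.getLogger().removeHandler(handler)
    handler.close()


def job():
    logging.info("running in %s", threading.current_thread().name)


log_dir = tempfile.mkdtemp()  # illustrative location for the log files
logging.getLogger().setLevel(logging.INFO)

jobs = []
for jobid in range(3):
    name = f"thread_{jobid}"
    handler = open_thread_handler(log_dir, name)
    worker = threading.Thread(target=job, name=name)
    worker.start()
    jobs.append((worker, handler))

# Wait for all workers, then clean up their handlers.
for worker, _ in jobs:
    worker.join()
for _, handler in jobs:
    close_thread_handler(handler)
```

With many short-lived threads, closing handlers this way keeps the number of open files bounded by the number of handlers that are attached at any one time.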