
My supervisor wants me to fix the "concurrency problem in logs", by which he means that the log files we produce have mixed-up timestamps at the boundaries between files. That is:

  • The first log file ends with:

    [03/Dec/2013:13:55:19]---------------------
    [03/Dec/2013:13:55:20]---------------------
    [03/Dec/2013:13:55:20]---------------------
    
  • The second file starts with:

    [03/Dec/2013:13:40:16]---------------------
    [03/Dec/2013:13:40:16]---------------------
    [03/Dec/2013:13:40:23]---------------------
    

We use a rotating file handler, and the second file's timestamps should start where the first one's end, but they don't. How can I make the timestamps appear in the logs in the right order across file rotation?

"Logger" class, which just uses Python logging module:

import logging

class logger:
    def __init__(self, logger_name='prod'):
        self.error_logger = logging.getLogger(logger_name + '_error')

    def error(self, msg='', level='error'):
        if msg:
            getattr(self.error_logger, level)(msg)

    def log(self, msg='', level='info'):
        if msg:
            getattr(self.error_logger, level)(msg)
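
For context, the wrapper is used like this (a made-up call site, not code from our app; the level string is dispatched to the underlying logging method via getattr, so any standard method name works):

log = logger('prod')                          # wraps logging.getLogger('prod_error')
log.log('service started')                    # dispatched to .info()
log.log('disk almost full', level='warning')  # dispatched to .warning()
log.error('request failed')                   # dispatched to .error()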

Formatting of the logs:

import logging
import sys
import time


class our_formatter(logging.Formatter):

    def find_topmost_stack_frame(self):
        # Walk outward through the callers until the dispatcher's
        # __call__ frame, collecting frames along the way.
        i = 0
        stack = []
        while True:
            try:
                fr = sys._getframe(i)
                if fr.f_code.co_name == '__call__':
                    break
                stack.append(fr)
            except ValueError:
                # Ran past the bottom of the stack.
                break
            i += 1
        return "%s:%s" % (stack[-4].f_code.co_filename, stack[-4].f_lineno)

    def format(self, record):
        # The same record is handed to every handler on a logger;
        # only build the message once.
        if getattr(record, 'done', False):
            return record.msg

        rtime = time.strftime("%d/%b/%Y:%H:%M:%S", time.localtime(record.created))
        from tools.user_management import user_pack
        email = user_pack().get_email()

        if record.levelno > 20:  # above INFO: include the call site
            if email:
                record.msg = '[%s][user:%s][%s] {%s} %s' % (rtime, email, record.levelname,
                                                            self.find_topmost_stack_frame(), record.msg)
            else:
                record.msg = '[%s][%s] {%s} %s' % (rtime, record.levelname,
                                                   self.find_topmost_stack_frame(), record.msg)
        else:
            if email:
                record.msg = '[%s][user:%s][%s] %s' % (rtime, email, record.levelname, record.msg)
            else:
                record.msg = '[%s][%s] %s' % (rtime, record.levelname, record.msg)

        record.done = True
        return logging.Formatter.format(self, record)
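
The record.done guard is there because logging hands the same LogRecord object to every handler attached to a logger, so a formatter that rewrites record.msg would otherwise add its prefix once per handler. A tiny standalone illustration (not from our code base):

import logging

# Build a record by hand and mutate it the way our_formatter does.
rec = logging.LogRecord('prod_error', logging.ERROR, 'app.py', 1,
                        'something broke', None, None)
rec.msg = '[prefix] ' + rec.msg          # first handler's formatter runs
print(logging.Formatter('%(message)s').format(rec))
# A second handler formatting the same record would now see
# '[prefix] something broke' -- hence the done flag short-circuit.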

And finally, the logger configuration:

log = cherrypy.log
log.error_file = None

maxBytes = getattr(log, "rot_maxBytes", 10000000)
backupCount = getattr(log, "rot_backupCount", 1000)
fname = getattr(log, "rot_error_file", "logs/error.log")

logger = logging.getLogger()
logger.setLevel(0)

# Make a new RotatingFileHandler for the error log.
h = logging.handlers.RotatingFileHandler(fname, 'a', maxBytes, backupCount)
h.setFormatter(rest_formatter())
log.error_log.addHandler(h)

# Set up the custom ReST logger, reusing the same handler.
logger = logging.getLogger("rest_error")
logger.addHandler(h)

# Set up our custom logger: note this opens a second
# RotatingFileHandler on the same file as `h`.
ha = logging.handlers.RotatingFileHandler(fname, 'a', maxBytes, backupCount)
ha.setFormatter(our_formatter())
logger = logging.getLogger("prod_error")
logger.addHandler(ha)

The application is multithreaded; however, the built-in logging module should be thread-safe (I was reading parts of its code today, and it certainly makes use of locks).

The problem occurs only between the beginning of one file and the end of the previous one (never in the middle of a file), so I think it's some case of the logger reserving file space. Even so, as I understand it the order should still be preserved, since there should be only one logger instance per specified file handler.

We have A LOT of logging. By a lot I mean there are often 10+ log entries every second.
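
To show the shape of the setup, here is a stripped-down, self-contained sketch with two RotatingFileHandler objects opened on the same path (the file name and sizes are made up for the demo):

import logging.handlers
import threading

def make_logger(name):
    # Each call opens its OWN file object on the same path.
    h = logging.handlers.RotatingFileHandler('demo.log', 'a',
                                             maxBytes=2000, backupCount=5)
    lg = logging.getLogger(name)
    lg.addHandler(h)
    lg.setLevel(logging.INFO)
    return lg

loggers = [make_logger('rest_error'), make_logger('prod_error')]

def spam(lg):
    for i in range(500):
        lg.info('message %d via %s', i, lg.name)

threads = [threading.Thread(target=spam, args=(lg,)) for lg in loggers]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Each handler tracks its file size and rotates independently, so
# after a rollover the two handlers can end up writing to two
# different physical files at once.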

p4r4noj4
    We need to see some of your code, to even begin understanding how to fix your problem. – William Denman Dec 13 '13 at 17:05
  • How the logs are setup. Are there multiple processes/threads? – jfs Dec 13 '13 at 18:31
  • why do different handlers write to the same file? – jfs Dec 15 '13 at 23:59
  • @J.F.Sebastian I think it's because we didn't want to have too many types of log files. Anyway - cherrypy itself logs almost nothing to the log files (only the startup/restart sequences). And rest_error - I haven't seen it used anywhere after initialization. – p4r4noj4 Dec 16 '13 at 10:35
  • @J.F.Sebastian - correction, I just checked and the rest logger is actually used as well. However, the problem with out-of-order timestamps sometimes happens with logs coming only from the rest logger. I wonder if it might be some kind of problem caused by cherrypy's thread dispatching when receiving requests. – p4r4noj4 Dec 16 '13 at 11:00
  • Try to attach the rotating handler to the root logger and remove all other handlers that access the same file. – jfs Dec 16 '13 at 11:08
  • @J.F.Sebastian So far it seems to have fixed the problem! I guess the additional handlers didn't work well together in a multithreaded environment. Please add your comment as an answer so I can tick it! – p4r4noj4 Dec 18 '13 at 14:42
  • @p4r4noj4: if tests show that it works, you can [post your own answer](http://stackoverflow.com/help/self-answer). You've done all the work. – jfs Dec 19 '13 at 02:40

2 Answers


The problem lay in multiple handlers for a single file. The loggers used different handlers, so they were trying to write to the same file at the same time, occasionally causing a new file to be created (and then they kept writing to two distinct files for some period of time).

Removing the "ha" handler seems to have fix the problem!

p4r4noj4

Reading your question, it seems to me that you need to flush the write buffer.

To force the system to write something to a file immediately, instead of waiting for it to flush the buffer's contents on its own, you do something like this:

from os import fsync

with open(filename, 'a') as fh:
    fh.write(something)
    fh.flush()            # push Python's buffer to the OS
    fsync(fh.fileno())    # ask the OS to write it to disk
    # continued code

This is a necessary step when `something` is only a tiny amount of data to write.

However, when a file is closed, the write buffer is normally emptied and its contents are written to the file before it is fully closed.
So I don't know whether this answer really brings you anything useful.

eyquem
  • The `logging` module calls `flush()` itself. Anyway, the rotated logfile is closed; that alone should flush the buffers. `fsync` might help ensure that data is written if a process crashes. See the discussion in [Threadsafe and fault-tolerant file writes](http://stackoverflow.com/q/12003805/4279). But I don't see how this is related to the "concurrency problem in logs". – jfs Dec 13 '13 at 18:28
  • I thought that a subsequent file could be opened while the former one remained unclosed for a certain time, not yet having had the timestamps written to it. – eyquem Dec 13 '13 at 18:50
  • My supervisor told me that actually when he looks at the logs there are two of them being written at the same time when one is almost finished. So the logger seems to flush its buffer to two files at the same time, which does seem rather weird. – p4r4noj4 Dec 13 '13 at 22:28