
I have a Red Hat Enterprise Linux system running several Python processes. Each process writes to the same log file via the standard Python WatchedFileHandler. Together, they write several dozen entries per second. The average entry length is about 200 bytes, occasionally longer.

I would expect to get mixed-up (interleaved) entries in that file, but I can't seem to find any. Why? Does the OS guarantee something for data below a threshold length? I can find mentions of that (PIPE_BUF) for pipes and FIFOs, but not for regular files. Or is the race window simply too narrow for my load on a typical system?

Vasiliy Faronov

2 Answers


UPDATE:

While logging is safe to use in a multithreaded environment:

The logging module is intended to be thread-safe without any special work needing to be done by its clients. It achieves this through using threading locks; there is one lock to serialize access to the module’s shared data, and each handler also creates a lock to serialize access to its underlying I/O.

http://docs.python.org/2/library/logging.html#thread-safety

it appears to be risky in a multiprocessing environment, since it doesn't protect the log file with file locks (all it uses for protection is a bunch of in-process mutexes). See the source code of the FileHandler class (used by WatchedFileHandler). To guarantee write atomicity you'll have to protect your log file with file locks yourself (a sketch follows below), or you can use something like ConcurrentLogHandler.
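
For example, a minimal sketch of the file-lock idea might look like this. The LockedWatchedFileHandler name and the sidecar ".lock" file are my own invention for illustration; this is not the internal mechanism of ConcurrentLogHandler:

import fcntl
import logging
import logging.handlers

class LockedWatchedFileHandler(logging.handlers.WatchedFileHandler):
    """Sketch: hold an exclusive flock() around each record so that
    records from different processes cannot interleave."""
    def emit(self, record):
        # Lock a sidecar file rather than the log file itself, because
        # the log file gets closed and reopened on rotation.
        with open(self.baseFilename + ".lock", "w") as lock:
            fcntl.flock(lock, fcntl.LOCK_EX)
            try:
                logging.handlers.WatchedFileHandler.emit(self, record)
                self.flush()  # make sure the record hits the file before releasing the lock
            finally:
                fcntl.flock(lock, fcntl.LOCK_UN)

This only helps, of course, if every process writing to the log uses the same locking scheme.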

So you were just lucky enough to avoid data overlaps in your log file. Usually operating systems don't provide any guarantees about write atomicity; moreover, different versions of libc may have different default values for the stream buffer size.

You can get the default stream buffer size on your system with the following C code:

#include <stdio.h>

int main(void)
{
    printf("%d\n", BUFSIZ);
    return 0;
}
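
Python 2's built-in file objects sit on top of C stdio, so BUFSIZ is the relevant figure for the logging stream there. For comparison, the io module (which backs open() in Python 3) exposes its own default buffer size, which you can check directly; note this is the io module's default, not necessarily what your libc's stdio uses:

import io
print(io.DEFAULT_BUFFER_SIZE)  # commonly 8192, but treat it as platform-dependent
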
Dan Kruchinin
  • Please re-read my question. I have several processes, not several threads in one process. – Vasiliy Faronov Dec 17 '13 at 13:11
  • @VasiliyFaronov From a brief look at the source code I can conclude that I was wrong: logging doesn't use file locks => it's not safe in a multiprocess environment. So you were just lucky enough to avoid data overlaps. In general, OSes don't guarantee "write atomicity", and each libc may have different default values for its internal buffer sizes (AFAIK there's no standard for this). I guess you'll have to protect your files with file locks yourself. – Dan Kruchinin Dec 17 '13 at 13:21
  • You can post that as an answer and I’ll accept it if nobody else weighs in. – Vasiliy Faronov Dec 17 '13 at 13:37
  • But it would be nice if you could also mention the “internal buffer sizes” for a typical GNU libc, and what guarantees (if any) these sizes imply. – Vasiliy Faronov Dec 17 '13 at 13:39
  • Can you clarify how the “stream buffer size” relates to my question? – Vasiliy Faronov Dec 17 '13 at 14:17
  • @VasiliyFaronov libc flushes a stream's buffer when it overflows. I assumed that the granularity of your writes is BUFSIZ, not 200 bytes, but according to nmenezes's answer above, WatchedFileHandler does a pretty ugly hack, explicitly flushing the stream's buffer and reopening the log file if it was modified by another process. That seems to be the thing that significantly reduces data overlapping (note that it's still not atomic => it doesn't save you from overlapping completely). – Dan Kruchinin Dec 17 '13 at 14:50
  • There is no such thing in `WatchedFileHandler`. It only [looks for changes in `st_dev` and `st_ino`](http://hg.python.org/cpython/file/tip/Lib/logging/handlers.py#l462). Anyway, so the libc buffer has nothing to do with inter-process atomicity? That is, when libc flushes the buffer, its contents can still get interleaved with another process’, right? – Vasiliy Faronov Dec 17 '13 at 14:52

From the docs:

http://docs.python.org/2.7/library/logging.handlers.html#watchedfilehandler

The WatchedFileHandler class, located in the logging.handlers module, is a FileHandler which watches the file it is logging to. If the file changes, it is closed and reopened using the file name.

The WatchedFileHandler works on Linux, but not on Windows (which does not have inodes). It monitors the device and inode numbers of the file it is logging to; if the file it opened has been replaced (for example by log rotation), it is closed and opened again.

Depending on how WatchedFileHandler was implemented, I thought it might repeat this check until it can safely write to the file, but that would mean your processes were in a constant open/close battle over the log file.

Correction: it checks once, closes the file, and reopens it. I'm still looking for another explanation of why you don't have problems with it.
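
For reference, the reopen logic boils down to roughly the following. This is my own paraphrase of the stdlib handler for illustration (the class name WatchedFileHandlerSketch is made up); it is not the verbatim CPython source:

import os
import logging

class WatchedFileHandlerSketch(logging.FileHandler):
    """Rough paraphrase of logging.handlers.WatchedFileHandler."""
    def __init__(self, filename, mode="a", encoding=None, delay=0):
        logging.FileHandler.__init__(self, filename, mode, encoding, delay)
        self._record_ids()

    def _record_ids(self):
        # Remember which device/inode the open stream points at.
        try:
            st = os.stat(self.baseFilename)
            self.dev, self.ino = st.st_dev, st.st_ino
        except OSError:
            self.dev, self.ino = -1, -1

    def emit(self, record):
        # If the path now names a different inode (e.g. the file was
        # rotated or replaced), close the old stream once, reopen the
        # file by name, then emit as a plain FileHandler would.
        try:
            st = os.stat(self.baseFilename)
            changed = (st.st_dev, st.st_ino) != (self.dev, self.ino)
        except OSError:
            changed = True
        if changed and self.stream is not None:
            self.stream.flush()
            self.stream.close()
            self.stream = self._open()
            self._record_ids()
        logging.FileHandler.emit(self, record)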

So far, I have not found a definitive reason why you don't see intermixed lines in your logs. WatchedFileHandler was designed to be used by one application (it serializes multithreaded access with an RLock around emit, with the valuable exception of supporting external changes such as log rotation). Looking at the source code of its superclasses did not turn anything up either. An untested theory is that:

a) the log is always opened in append mode (see the sketch after this list);

b) the handler checks only once, before writing, whether the file changed;

c) you always get complete lines because your lines are smaller than the write buffer. Each process starts writing from the end of the last line, writes its new line, and the cycle repeats. If two processes write at the same time, they both start their new lines from the current end of the file.
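
The append-mode point in (a) can be illustrated in isolation with plain os.write. This is only a sketch of O_APPEND behaviour, not what the logging module does internally (logging writes through a buffered stream and flushes it after each record):

import os

# With O_APPEND, the kernel positions every write() at the current end of
# the file, so concurrent writers never overwrite each other. Short records
# written with a single write() usually also come out unsplit on local
# filesystems, although POSIX does not guarantee that for regular files.
fd = os.open("test.log", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
os.write(fd, ("%d: one complete record per write()\n" % os.getpid()).encode())
os.close(fd)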

Try to look for anomalies in the log lines, like missing lines; if the size of the log entries is very regular, a concurrency problem can be hard to spot. I would look for anomalies in the last line of the log, too.

How many processors does your system have?

Test program to break the log:

import logging
import logging.handlers
import sys

FORMAT = "%(asctime)-15s %(process)5d %(message)s"

logger = logging.getLogger(__name__)
wfh = logging.handlers.WatchedFileHandler("test.log", "a")
wfh.setFormatter(logging.Formatter(FORMAT))
logger.addHandler(wfh)
logger.setLevel(logging.DEBUG)

# Each instance gets a distinct digit as its argument and writes 1000
# deliberately long records (> 4 KB), so that concurrent writes are
# likely to cross the stream buffer boundary.
n = int(sys.argv[1])

for x in range(1000):
    logger.info("%6d %s", x, str(n) * (4096 + n))

Run with a simple shell script:

python test.py 1 &
python test.py 2 &
python test.py 3 &
python test.py 4 &
python test.py 5 &
python test.py 6 &
python test.py 7 &
python test.py 8 &
python test.py 9 &

Running this on CentOS 6.3 with 4 processors, I start to see anomalies in the lines. You can change the size of the lines in the program; small lines look OK, but large lines are not.

You can test with:

 grep -v ^2013 test.log

If the log is OK, every line starts with 2013. This grep will not catch every mixed line, but I could find the rest easily with less.

So, it works for you because your lines are smaller than your write buffer, or because your processes don't write to the log very often; but even then, there is no guarantee that it will work all the time.

nmenezes
  • I’m pretty sure that this refers to changes in file metadata, specifically the inode, which happen on rotation. It would be very strange to reopen the log file after every write; nor does it guarantee atomicity. I’m afraid your answer is not helpful without additional references. – Vasiliy Faronov Dec 17 '13 at 14:39
  • In fact, you can see it in the source code: [`Lib/logging/handlers.py:462`](http://hg.python.org/cpython/file/tip/Lib/logging/handlers.py#l462). – Vasiliy Faronov Dec 17 '13 at 14:42
  • I think your append mode reference is a step in the right direction. See discussion of `O_APPEND` here: http://stackoverflow.com/questions/10650861/atomicity-of-write2-to-a-local-filesystem – Vasiliy Faronov Dec 17 '13 at 15:10
  • I still don’t have a clear picture, but I’m accepting this answer because it is helpful and sounds close to the truth, thank you. – Vasiliy Faronov Dec 19 '13 at 12:29