
I'm working on a Ruby library that needs to do logging. Ideally, I'd like multiple worker processes to be able to log into the same file. Looking at the source for the Logger class from Ruby's standard library, I see that efforts are made to synchronize writes into the log from multiple threads (as pointed out in the answer to Is Ruby's stdlib Logger class thread-safe?).

It seems there's a similar problem when multiple processes are writing into the same log file: depending on how the underlying layers decide to buffer / split the writes, each log message may not maintain its integrity.

So, is there a way of using the standard Logger class to allow multiple processes to safely log to a single file? If not, how is this typically accomplished in Ruby projects?

Here's what I mean by 'safely':

  1. Each log line is 'atomic': it appears in its entirety, uninterrupted, before the next message begins. e.g. nothing like [1/1/2013 00:00:00] (PID N) LOGMESS[1/1/2013 00:00:01] (PID M) LOGMESSAGE2\nAGE1
  2. Log messages need not be strictly ordered across processes, so long as the timestamps appearing in the log are correct.

Update:

I decided to take the Tin Man's advice and write a test, which you can find here: https://gist.github.com/4370423

The short version: Winfield is correct. At least with the default usage of Logger, it is safe to use from multiple processes simultaneously (for the definition of 'safe' given above).
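This isn't the gist itself, but a minimal sketch of that kind of test: fork several workers that each log through their own Logger into one shared file, then check that every line came through intact. The message format and counts here are arbitrary choices for illustration.

```ruby
require 'logger'
require 'tmpdir'

dir = Dir.mktmpdir
logfile = File.join(dir, "shared.log")
File.write(logfile, "")  # pre-create so no worker writes the "Logfile created" header
lines_per_proc = 200
workers = 4

pids = workers.times.map do
  fork do
    # Each process gets its own Logger; given a path, Logger opens it
    # with WRONLY|APPEND and sync=true.
    logger = Logger.new(logfile)
    lines_per_proc.times { |i| logger.info("pid=#{Process.pid} msg=#{i} " + "x" * 60) }
    exit!(0)
  end
end
pids.each { |pid| Process.wait(pid) }

lines = File.readlines(logfile)
# Every line should be whole: right total count, and each one matches the
# expected default-format tail with no interleaving mid-line.
intact = lines.all? { |l| l.match?(/INFO -- : pid=\d+ msg=\d+ x{60}$/) }
```

If a write were ever split or interleaved, either the line count or the per-line pattern check would fail.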

The key factor seems to be that, if given a file path (instead of an already-open IO object), Logger will open the file in WRONLY|APPEND mode and set sync=true on it. The combination of these two things (at least in my testing on Mac OS X) appears to make it safe to log concurrently from multiple processes. If you want to pass in an already-open IO object, just make sure you create it the same way.
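A sketch of what "create it the same way" means in practice, assuming you want to hand Logger a pre-opened IO (the path here is just a temp file for illustration):

```ruby
require 'logger'
require 'tmpdir'

path = File.join(Dir.mktmpdir, "app.log")

# Open the file the same way Logger does when given a path:
# append-only, and sync = true so each formatted line is flushed
# to the OS in a single write() rather than buffered in Ruby.
io = File.open(path, File::WRONLY | File::APPEND | File::CREAT)
io.sync = true

logger = Logger.new(io)
logger.info("hello from #{Process.pid}")
io.close
```

Append mode matters because, on POSIX systems, writes to a file opened with O_APPEND are positioned atomically at the end of the file, so concurrent writers don't clobber each other's offsets.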

  • Define "safely". Do you need to log single lines, or are they groups of lines that need to be kept together? – tadman Dec 17 '12 at 06:51
  • @tadman - nope, just single lines, as noted in the question. – grumbler Dec 17 '12 at 15:41
  • @theTinMan - in my experience, it's not always easy to reproduce concurrency-related issues like this in testing, but I suppose it can't hurt to try. – grumbler Dec 17 '12 at 15:43
  • 1
    2-core/2-thread machines are less likely to exhibit threading issues than a 4-core/8-thread machine, so if you have access to one of the latter, it will be easier to reproduce any potential threading problems. For instance, aggressively writing 80-character lines into your logging system and then reading them back, verifying the correct line count and line lengths, would help build confidence that the locking is working correctly. – tadman Dec 17 '12 at 16:19
  • @tadman, correct. Years ago I wrote a multi-threaded spider in Perl to process a huge Fortune 50 site for analytics -- it involved spidering over 20 separate sites with thousands of pages. I had over 50 spiders running at once, and, yeah, debugging was tedious. Concurrency was an issue, and many hours were spent in the debugger single-stepping single instances, then two instances, then ten, and watching logs. It worked though. (And we won the contract as a result. :-) ) Your idea of reading the log output is good; set up tests that write identical text, then watch for collisions/overlaps. – the Tin Man Dec 17 '12 at 18:06

1 Answer


Yes, you can safely write interleaved log data to a single logfile in the way you've described.

However, you're better off writing a separate log for each process or using a consolidated logging system such as syslog. Here are a couple of reasons why:

  • Log rotation/management: truncating or rolling logfiles is difficult when you have to coordinate signaling across multiple processes
  • Interleaved data can be confusing even if you're injecting the PID to disambiguate messages
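On the second point, a common way to make interleaved output at least traceable is a custom formatter that tags every line with the PID. This is a hypothetical sketch (the timestamp and layout are arbitrary choices), shown here writing to a StringIO so it's self-contained:

```ruby
require 'logger'
require 'time'
require 'stringio'

buffer = StringIO.new   # stand-in for the shared logfile
logger = Logger.new(buffer)

# Tag every line with the writing process's PID so interleaved
# output from several workers can be untangled afterward.
logger.formatter = proc do |severity, time, _progname, msg|
  "[#{time.utc.iso8601}] (PID #{Process.pid}) #{severity}: #{msg}\n"
end

logger.info("worker started")
```

Even with PID tags, though, following one worker's activity means grepping it back out of the shared file, which is the confusion the bullet above is warning about.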

I'm currently managing a number of Resque workers per system with a single logfile, and I wish I had a separate logfile for each worker. It's been difficult to debug issues and manage the logs properly.

Winfield
  • Thanks! It seems you're correct, and it is safe by the criteria above. Your points about the desirability of doing this are well taken; however, in this particular case I've been asked explicitly to produce a single log file. See my update to the question above for a test demonstrating this result. – grumbler Dec 24 '12 at 19:34