21

I have a log file that is constantly growing. How can I watch and parse it via a Ruby script?

The script should parse each new line as it is written to the file and print something to the screen when the new line contains the string 'ERROR'.

the Tin Man
KingInk

7 Answers

16
def watch_for(file, pattern)
  f = File.open(file,"r")
  f.seek(0,IO::SEEK_END)
  while true do
    select([f])
    line = f.gets
    puts "Found it! #{line}" if line=~pattern
  end
end

watch_for("g.txt",/ERROR/)

Thanks to ezpz for the idea: with the select method you can get what you want. select listens on the IO stream, so you can read the bytes that arrive 'late'.

Qianjigui
  • 5
    Note: `select` always returns immediately when used on a file stream: You can always read EOF from that file stream, so your ruby process ends up spinning while it waits for the file to update. Different operating systems tend to offer different tools for waiting on files, Linux has inotify and OS X has fsevents - there are convenient ruby gems that wrap them, too. – antifuchs May 18 '12 at 06:22
  • I think the reason it uses 100% CPU is because if you remove the line that says `if line=~pattern` it just keeps returning blank lines even if there is no new update. I'm not sure how to fix it myself, as I just stumbled across that issue as well. – FilBot3 Apr 03 '14 at 19:54
  • 3
    This is an *INCORRECT* implementation: the select here DOES NOT block. It works because you read EOF, which fails the comparison. – lzap Jan 21 '16 at 10:31
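
A minimal sketch of the polling fix the comments point toward: when `gets` returns nil at EOF, sleep briefly instead of looping straight away. The method name `follow` and the `interval` and `max_idle_polls` parameters are assumptions made for illustration, not part of the answer above. Note that `gets` can hand back a partial line if the writer has not finished it yet; this sketch does not guard against that.

```ruby
# Hypothetical helper: yields each new line appended to the file at `path`.
# `interval` is the pause between polls; `max_idle_polls` (optional) ends the
# loop after that many consecutive empty polls, which is handy for testing.
def follow(path, interval: 0.5, max_idle_polls: nil)
  File.open(path, "r") do |f|
    f.seek(0, IO::SEEK_END)        # start at the current end of the file
    idle = 0
    loop do
      line = f.gets                # returns nil at EOF on a regular file
      if line
        idle = 0
        yield line
      else
        idle += 1
        break if max_idle_polls && idle >= max_idle_polls
        sleep interval             # wait for more data instead of spinning
      end
    end
  end
end
```

It could then be called as `follow("g.txt") { |line| puts "Found it! #{line}" if line =~ /ERROR/ }`. A shorter interval lowers latency at the cost of more wakeups.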
10

There are two approaches:

  • poll the file in an infinite loop (as in Qianjigui's answer, but it is a good idea to put a sleep inside the loop)
  • use the OS event subsystem: kqueue on BSD, inotify on Linux

Here is an article I wrote about this: Ruby for Admins: Reading Growing Files. So the program combining both event subsystem and polling looks like this:

def tail_dash_f(filename)
  File.open(filename) do |file|
    file.read                       # skip what is already in the file
    case RUBY_PLATFORM              # string with OS name, like "amd64-freebsd8"
    when /bsd/, /darwin/
      require 'rb-kqueue'
      queue = KQueue::Queue.new
      queue.watch_file(filename, :extend) do
        yield file.read             # pass on whatever was appended
      end
      queue.run                     # block and dispatch events
    when /linux/
      require 'rb-inotify'
      queue = INotify::Notifier.new
      queue.watch(filename, :modify) do
        yield file.read             # pass on whatever was appended
      end
      queue.run                     # block and dispatch events
    else
      loop do                       # no event subsystem: fall back to polling
        changes = file.read
        unless changes.empty?
          yield changes
        end
        sleep 1.0                   # avoid a busy loop
      end
    end
  end
end

tail_dash_f ARGV.first do |data|
  print data
  if data =~ /error/i
    # do something else, for example send an email to administrator
  end
end
Grych
  • This is excellent. It should be noted for anyone who doesn't know how to get those gems that they can do "gem install rb-inotify" (as an example) and then add "require 'rubygems'" to the top of their script if it isn't there (and they aren't running rails) – David Ljung Madison Stellar Jan 28 '20 at 22:50
9

You can use Kernel#select in the following way:

def watch_for(file,pattern)
   f = File.open(file,"r")

   # Since this file exists and is growing, seek to the end of the most recent entry
   f.seek(0,IO::SEEK_END)

   while true
      select([f])
      puts "Found it!" if f.gets =~ pattern
   end
end

Then call it like:

watch_for("some_file", /ERROR/)

I've elided all error checking and such - you will want to have that and probably some mechanism to break out of the loop. But the basic idea is there.
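
As a rough sketch of what that error checking and a way out of the loop might look like, under some assumptions: the name `watch_with_checks`, the `stop` callable and the `interval` parameter are hypothetical, and the sketch sleeps between reads rather than relying on `select`.

```ruby
# Hypothetical variant of watch_for. `stop` is any callable that returns true
# when the loop should end; `interval` is the pause between empty reads.
# Returns true after a normal stop, false if the file could not be opened.
def watch_with_checks(file, pattern, stop: -> { false }, interval: 0.5)
  File.open(file, "r") do |f|
    f.seek(0, IO::SEEK_END)        # only look at lines written from now on
    until stop.call
      line = f.gets
      if line.nil?
        sleep interval             # nothing new yet
      elsif line =~ pattern
        yield line
      end
    end
  end
  true
rescue Errno::ENOENT, Errno::EACCES => e
  warn "cannot watch #{file}: #{e.message}"
  false
rescue Interrupt
  true                             # Ctrl-C ends the watch quietly
end
```

For example, `watch_with_checks("some_file", /ERROR/) { |line| puts "Found it! #{line}" }` runs until interrupted, while passing something like `stop: -> { Time.now > deadline }` gives it a time limit.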

Mike Campbell
ezpz
  • This solution uses 100% CPU because, in a file, your `select` will immediately return true on the end of the file because `read()` will read EOF (thanks to @antifuchs). – hagello Dec 07 '20 at 20:51
5

If you're on Linux...

tail -f log/development.log | grep "ERROR"

Unless you really wanted it to be a Ruby script for some reason.

the Tin Man
cakeforcerberus
  • Wrapping this in a script gives you the ability to react to an error in a more meaningful way than by noting that it occurred. grepping is good when you want to post-process, but it is rather limited in its ability to invoke dynamic behavior. – ezpz Aug 18 '09 at 14:09
  • 1
    "...output something to the screen when the new line contains the string 'ERROR'" is not dynamic behavior :) – cakeforcerberus Aug 18 '09 at 14:18
  • 1
    This works best for me. Meta-programming around the error log seems like effort better spent elsewhere. – MattC Aug 18 '09 at 14:21
4

Check out the file-tail gem.

3

Poor man's approach for quick stuff:

  1. a Ruby script that does

    ARGF.each do |line|
      ...
    end
  2. Running screen with:

    tail -f file | ruby script 
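
For the question's 'ERROR' case, the body of such a script might look like the sketch below. The method name `error_lines` is an assumption for illustration; it accepts any object that responds to `each_line`, which includes `ARGF`.

```ruby
# Hypothetical helper: yields every line from `io` that matches `pattern`.
# `io` can be ARGF, $stdin, an open File, or a StringIO.
def error_lines(io, pattern = /ERROR/)
  io.each_line do |line|
    yield line if line =~ pattern
  end
end
```

In the script itself the last line would be something like `error_lines(ARGF) { |line| puts "Found it! #{line}" }`, so that `tail -f file | ruby script` prints matching lines as they arrive. If the script's own output is piped onward, setting `$stdout.sync = true` first keeps Ruby from buffering it.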
    
the Tin Man
reto
3

Building on @Qianjigui's idea, but without using 100% CPU:

def watch_for(file, pattern)
  # Replace -n0 with -n+1 if you want to read from the beginning of file
  f = IO.popen(%W[tail -f -n0 #{file}])
  while line = f.gets
    puts "Found it! #{line}" if line =~ pattern
  end
end

watch_for('g.txt', /ERROR/)
tig
  • This solution uses 100% CPU because, in a file, your `select` will immediately return true on the end of the file because `read()` will read EOF (thanks to @antifuchs). – hagello Dec 07 '20 at 20:50
  • @hagello That is why I read from `tail` pipe instead of file directly, so `tail -f` becomes responsible for waiting and will not return EOF (probably until error or interrupt). I've also rechecked the code and looks like `loop` and `select` are also not needed – tig Dec 11 '20 at 17:07