
I use the following code to track ssh log-ins:

import time

def follow(thefile):
  thefile.seek(0, 2)  # jump to the end of the file
  while True:
    line = thefile.readline()
    if not line:
      time.sleep(0.1)
      continue
    yield line

if __name__ == '__main__':
  logfile = open('/var/log/auth.log', 'r')
  loglines = follow(logfile)
  for line in loglines:
    print('do something here')

I've noticed that this script suddenly stops working after a couple of days. I don't get any error and it doesn't terminate; it just stops processing lines, as if `readline()` never returned.

So I executed `echo 'test' >> auth.log.1`, and this indeed got processed by the script, because some time ago `auth.log` had been renamed to `auth.log.1`.

How can I track when such a log rotation happens and adjust accordingly?

codeforester
Daniel F
  • Ah. My apologies. I admit I didn't read the question thoroughly before flagging. I am chastising myself now. – SiHa Jun 07 '17 at 11:56

5 Answers


Using e4c5's answer I ended up with this code, which also solves the issue of calling `readline()` multiple times per second.

During the first invocation it skips to the end of the file and waits for modifications. When the file is moved, it reopens the file and reads the entire content, then starts to wait.

import os
import time
import traceback
import threading
import inotify.adapters

logfile = b'/var/log/auth.log'
#logfile = b'logfile.log'

##################################################################

def process(line, history=False):
  if history:
    print('=', line.strip('\n'))
  else:
    print('>', line.strip('\n'))

##################################################################

from_beginning = False
notifier = inotify.adapters.Inotify()
while True:
  try:
    #------------------------- check
    if not os.path.exists(logfile):
      print('logfile does not exist')
      time.sleep(1)
      continue
    print('opening and starting to watch', logfile)
    #------------------------- open
    file = open(logfile, 'r')
    if from_beginning:
      for line in file.readlines():
        process(line, history=True)
    else:
      file.seek(0,2)
      from_beginning = True
    #------------------------- watch
    notifier.add_watch(logfile)
    try:
      for event in notifier.event_gen():
        if event is not None:
          (header, type_names, watch_path, filename) = event
          if set(type_names) & set(['IN_MOVE_SELF']): # moved
            print('logfile moved')
            notifier.remove_watch(logfile)
            file.close()
            time.sleep(1)
            break
          elif set(type_names) & set(['IN_MODIFY']): # modified
            for line in file.readlines():
              process(line, history=False)
    except (KeyboardInterrupt, SystemExit):
      raise
    except:
      notifier.remove_watch(logfile)
      file.close()
      time.sleep(1)
    #-------------------------
  except (KeyboardInterrupt, SystemExit):
    break
  except inotify.calls.InotifyError:
    time.sleep(1)
  except IOError:
    time.sleep(1)
  except:
    traceback.print_exc()
    time.sleep(1)

##################################################################
Daniel F

This is best done with inotify; you don't want to keep polling the file system to ask whether things have changed on each iteration of the loop. That's a lot of wasted I/O. inotify will notify you when a change occurs. There is an example right in the manual which shows its usage with a log file.

e4c5

You can have a look at the inode of the file:

import os
inode = os.stat('/var/log/auth.log').st_ino

When the inode changes, the file has been rotated.
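As a minimal sketch of this idea (assuming Python 3; `follow_rotated` is a hypothetical name, not from the answer), a rotation-aware version of the question's `follow()` generator could compare the inode on every idle poll and reopen the file when it changes:

```python
import os
import time

def follow_rotated(path, poll_interval=1.0):
    """Yield new lines from `path`, reopening the file whenever the inode
    at that path changes (i.e. after a log rotation)."""
    f = open(path, 'r')
    f.seek(0, 2)                                # start at the end, like the original follow()
    inode = os.fstat(f.fileno()).st_ino
    while True:
        line = f.readline()
        if line:
            yield line
            continue
        time.sleep(poll_interval)
        try:
            if os.stat(path).st_ino != inode:   # a new file now sits at `path`
                f.close()
                f = open(path, 'r')             # reopen and read the new file from the start
                inode = os.fstat(f.fileno()).st_ino
        except FileNotFoundError:
            pass                                # rotation in progress; retry on the next poll
```

Because the inode is only checked when `readline()` comes up empty, any lines still waiting in the rotated-away file are drained before the switch.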

olisch
  • I was thinking about `os.stat('/var/log/auth.log').st_size` and act accordingly when size shrinks. So inode changes when the file is renamed? Would I need to do this in a thread which checks every couple of seconds? How would I then tell the script to stop waiting in `readline()`? – Daniel F Jun 07 '17 at 09:04
  • @DanielF no the inode of the original file won't change when the file is renamed. But if you always watch for the inode of /var/log/auth.log inside your while loop, and you recognize a file change, you would close your file handle and reopen it on the new auth.log. Instead of opening the file in `__main__`, you should do that as well in your `follow` function. – olisch Jun 07 '17 at 09:12
  • Oh, right. While checking `inotify` I came to believe that a combination of `inotify` and your suggestion is the best solution, so I don't poll but still get to know when the inode changes, since inotify apparently doesn't show me that the file got renamed, just that a file attribute changed. – Daniel F Jun 07 '17 at 09:18

Apparently, I can't comment until I have >= 50 reputation.

@daniel-f has a GREAT example! The only edge case I ran into is that when the service creating the rotating log files I am reading restarts, it deletes the old files and creates new ones.

This causes the notifier to lose visibility into the log file (since it is a different file).

Since the service writes to the log file every 60 seconds, I did a quick modification to the for loop, shown below:

last_pull = datetime.datetime.now()
while True:
    ...
    for event in notifier.event_gen():
        if event is not None:
            last_pull = datetime.datetime.now()
            (header, type_names, watch_path, filename) = event
            if set(type_names) & set(['IN_MOVE_SELF']):  # moved
                notifier.remove_watch(file_watcher.intput_logfile)
                file.close()
                time.sleep(1)
                break
            elif set(type_names) & set(['IN_MODIFY']):  # modified
                lines = file.readlines()
                for line in lines:
                    process(line, file_watcher, history=False)
        else:
            if (datetime.datetime.now() - last_pull).total_seconds() >= time_to_refresh:
                last_pull = datetime.datetime.now()
                notifier.remove_watch(file_watcher.intput_logfile)
                file.close()
                break

This re-watches the file after 75 seconds without an update.

MSven
  • I just issued a `sudo service ssh restart` to check, but it continued using the same logfile. I even see `sshd[1226]: Received signal 15; terminating.` followed by a `sshd[29099]: Server listening on 0.0.0.0 port 22.` in the same log. I've been using this for months now and had no issues. I'll add a `IN_DELETE_SELF` and others and have them get logged to see if they occur. – Daniel F Aug 27 '18 at 16:37
  • I've been thinking more about this, and I doubt that a restart of `sshd` will cause a new log to be created. Logs are there for troubleshooting and forensics, so deleting it will most likely not occur. A deletion only happens in a "non-erase-context" when it is moved to another filesystem (copy+delete). Only log rotation should be occurring (point to a new file, rename the old one). I'll start reading a bit about this when I find time for it. – Daniel F Aug 27 '18 at 18:40
  • Also it would not be an infinite loop, it would simply not get out of the blocking `notifier.event_gen()` since no new events would come in. Which is what caused my problem in the first place. I'm upvoting your answer for two reasons: It's really helpful input and brings this account of yours closer to the 50 reputation points. – Daniel F Aug 27 '18 at 18:44
  • To clarify the situation I was in. – MSven Aug 27 '18 at 21:16
  • Thanks for the note. Made me realize the issue. The service that was writing log files deletes the old files and creates new ones. This caused the notifier to essentially lose its watch over the file. Instead of closing and breaking, I just need to add a new watch to the file if it's been > the normal interval the file is modified (e.g. logFile is written to ever 1 min, so I wait 75 seconds to see if nothing has been read). Edited code to identify the better way (pending your thoughts) – MSven Aug 27 '18 at 21:27
  • Oh, I see YOU just made me realize the issue. I was dealing with a special case and not thinking beyond that. I'm using this on the `sshd` log, and never tried it with other types of log files. It didn't even come to my mind that other logs would be analyzed and that those would behave differently. I will definitely deal with this when I find some time. What log are you analyzing so that I can check with that? – Daniel F Aug 27 '18 at 23:23
  • It's a custom script used to pull netflow data from a Cisco device (https://github.com/CiscoDevNet/stealthwatch-data-exporter). The main thing is that all files are deleted and new ones are initialized on service restart. Seems like the adjusted code works for my situation. It's just a decision of starting from the beginning to ensure you got all of the data or starting from appends. – MSven Aug 27 '18 at 23:56

How to detect log file rotation by looking for a changed inode number

I really like the simplicity of @olisch's answer: just detect when the log file is rotated by detecting when the inode number changes.

This works because if you are logging into ~/mylog.log, log rotation will periodically rename this file to ~/mylog.log.1, then to ~/mylog.log.2, then to ~/mylog.log.3, etc. Each time it does this sequential renaming of the files to perform log rotation, it will create a brand new file at ~/mylog.log, which is the active log file, making the inode number of the file at that path change.

This polling-in-a-loop approach especially makes sense if you are already logging at a fixed time interval via a loop anyway!

So, here is a full demo (excluding the logging portion) to show how to detect a changed inode number of a log file at a given path:

import os
import pathlib
import time


# Log into ~/mylog.log. Log rotation will periodically rename this file to
# ~/mylog.log.1, then ~/mylog.log.2, ~/mylog.log.3, etc. Each time it does this,
# it will create a brand new file at ~/mylog.log, making the inode number of
# the file at that path change.
log_file_path = os.path.join(pathlib.Path.home(), 'mylog.log')

# this is your main logging loop
inode_number_old = os.stat(log_file_path).st_ino
while True:
    time.sleep(1)  # do your periodic logging work here; sleep so this isn't a busy loop

    # Detect log rotation events
    inode_number_new = os.stat(log_file_path).st_ino
    if inode_number_old != inode_number_new:
        print("Log file rotation just detected!")
        inode_number_old = inode_number_new

        # Now do whatever it is you want to do whenever a log rotation is
        # detected. Ex: change the new log file's permissions to read/write
        # for everyone.
        PERMISSIONS_EVERYONE_READ_WRITE = 0o666
        os.chmod(log_file_path, PERMISSIONS_EVERYONE_READ_WRITE)

I use the above technique in my cpu_logger.py program in my eRCaGuy_dotfiles repo, which continually logs my CPU usage so I can detect which programs are consuming all my CPU and locking up my computer.

Note that you can also read and verify the inode numbers of files at the command-line in Linux with:

ls -i

For an easier-to-look-at single-column output, use:

ls -i1

References

  1. @olisch's answer
  2. https://docs.python.org/3/library/os.html#os.stat_result.st_ino - official Python documentation
  3. Python logger file permissions
  4. My answer: Ask Ubuntu: How to log CPU load?
Gabriel Staples