I'm interested in building a Python script that reports how many lines per interval (say, per minute) are being written to a file. The files are written as data comes in, with one new line for each user that passes data through an external program. Knowing the number of lines per interval gives me a metric I can use for future expansion planning. The output files consist of lines that are all roughly the same length, each terminated by a newline. I was thinking of writing a script that measures the length of the file at one point in time, measures it again later, and subtracts the two to get the result. However, I don't know whether this is ideal, since measuring the length of the file takes time and that may skew my results. Does anyone have any other ideas?
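On the worry about measurement time, here is a minimal sketch of the size-difference idea. `os.path.getsize` asks the filesystem for the size through a stat call instead of reading the contents, so its cost should not grow with the file. The function names `size_growth` and `estimated_lines_per_minute` and the `avg_line_bytes` parameter are made up for illustration, and the estimate is only as good as the assumption that lines are about the same length.

```python
import os
import time


def size_growth(path, interval_seconds):
    """Return (bytes_added, elapsed_seconds) for `path` over one interval."""
    start_size = os.path.getsize(path)  # stat call: reads metadata, not contents
    start_time = time.time()
    time.sleep(interval_seconds)
    end_size = os.path.getsize(path)
    elapsed = time.time() - start_time
    # A negative result means the file was truncated or replaced meanwhile.
    return end_size - start_size, elapsed


def estimated_lines_per_minute(bytes_added, elapsed_seconds, avg_line_bytes):
    """Convert byte growth into an approximate lines-per-minute rate.

    avg_line_bytes is an assumed average line length, newline included.
    """
    lines = bytes_added / float(avg_line_bytes)
    return lines * 60.0 / elapsed_seconds
```

Because the elapsed time is measured rather than assumed to equal the sleep time, any delay in taking the two size readings is folded into the rate instead of skewing it.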
Based on what people are saying, I threw this together to start:
import os
import subprocess
import time
from daemon import runner
#import daemon

inputfilename = "/home/data/testdata.txt"


class App():
    def __init__(self):
        # Attributes that DaemonRunner reads from the app object
        self.stdin_path = '/dev/null'
        self.stdout_path = '/dev/tty'
        self.stderr_path = '/dev/tty'
        self.pidfile_path = '/tmp/count.pid'
        self.pidfile_timeout = 5

    def run(self):
        while True:
            count = 0
            # Read the file in 8 MiB binary chunks and count newline bytes
            with open(inputfilename, 'rb') as filein:
                while True:
                    chunk = filein.read(8192 * 1024)
                    if not chunk:
                        break
                    count += chunk.count(b'\n')
            print(count)
            # set the sleep time for repeated action here:
            time.sleep(60)


app = App()
daemon_runner = runner.DaemonRunner(app)
daemon_runner.do_action()
It does the job of getting the count every 60 seconds and printing it to the screen. My next step is the math, I guess.
One more edit: it now prints the number of lines added in each one-minute interval:
import os
import subprocess
import time
from daemon import runner
#import daemon

inputfilename = "/home/data/testdata.txt"


class App():
    def __init__(self):
        # Attributes that DaemonRunner reads from the app object
        self.stdin_path = '/dev/null'
        self.stdout_path = '/dev/tty'
        self.stderr_path = '/dev/tty'
        self.pidfile_path = '/tmp/twitter_counter.pid'
        self.pidfile_timeout = 5

    def run(self):
        # Total from the previous pass; starts at 0, so the first value
        # printed is the file's whole line count, not a one-minute delta
        counter1 = 0
        while True:
            count = 0
            # Read the file in 8 MiB binary chunks and count newline bytes
            with open(inputfilename, 'rb') as filein:
                while True:
                    chunk = filein.read(8192 * 1024)
                    if not chunk:
                        break
                    count += chunk.count(b'\n')
            print(count - counter1)
            counter1 = count
            # set the sleep time for repeated action here:
            time.sleep(60)


app = App()
daemon_runner = runner.DaemonRunner(app)
daemon_runner.do_action()
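The loop above re-reads the whole file every minute, which gets slower as the file grows. A possible refinement, sketched here under the assumption that the file is only ever appended to (or occasionally truncated), is to remember the byte offset reached on the previous pass and count newlines only in the bytes after it. The helper name `count_new_lines` and its parameters are hypothetical, not part of any library.

```python
import os


def count_new_lines(path, offset=0, chunk_size=1024 * 1024):
    """Count newline bytes appended to `path` since byte position `offset`.

    Returns (new_line_count, new_offset); pass new_offset to the next call.
    """
    if os.path.getsize(path) < offset:
        # Assumption: a smaller file means it was truncated or replaced,
        # so start counting again from the beginning.
        offset = 0
    count = 0
    with open(path, 'rb') as f:
        f.seek(offset)
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            count += chunk.count(b'\n')
        new_offset = f.tell()
    return count, new_offset
```

Inside `run`, this would replace the read loop: keep an `offset` variable starting at 0, call `new_lines, offset = count_new_lines(inputfilename, offset)` each minute, and print `new_lines` directly. A line that is only partly written when the pass ends is counted on the next pass, once its newline arrives.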