I'm working on a Python script to parse Squid (http://www.squid-cache.org/) log files. Although the logs are rotated every day to stop them getting too big, they still reach 40-90 MB by the end of each day.
Essentially, I'm reading the file line by line, parsing out the data I need (IP, requested URL, time), and adding it to an SQLite database. However, this seems to take a very long time (it has been running for over 20 minutes now).
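For illustration, here is a minimal sketch of that kind of parse-and-insert loop. It is not the exact script. It assumes Squid's default native access.log layout (timestamp, elapsed, client IP, result code, size, method, URL, ...), and the table name `requests` and the helpers `parse_line` and `load_log` are made up for the example. All inserts go through one transaction, since committing after every row is a common cause of slow SQLite loads.

```python
import sqlite3


def parse_line(line):
    """Return (time, ip, url) from one log line, or None if it is malformed.

    Assumes the native Squid layout:
    timestamp elapsed client code/status bytes method URL ...
    """
    fields = line.split()
    if len(fields) < 7:
        return None
    return float(fields[0]), fields[2], fields[6]


def load_log(log_path, db_path):
    """Parse every line of log_path and insert the rows into db_path."""
    conn = sqlite3.connect(db_path)
    # Hypothetical schema, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS requests (time REAL, ip TEXT, url TEXT)"
    )
    with open(log_path) as log:
        rows = (row for row in map(parse_line, log) if row is not None)
        # "with conn" wraps all inserts in a single transaction and
        # commits once at the end instead of once per row.
        with conn:
            conn.executemany("INSERT INTO requests VALUES (?, ?, ?)", rows)
    conn.close()
```

Calling `load_log("access.log", "squid.db")` would load one day's file in a single pass. The file names are placeholders.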
So obviously, re-reading the whole file on every run isn't an option. What I would like to do is read the file once and then pick up only the new lines as they are written. Or, even better, start the script at the beginning of the day and have it read the data in real time as it is added, so there is never a long processing run.
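To make the idea concrete, here is a rough sketch of one way this could look: remember the byte offset already processed and, on each poll, read only what was appended after it. The names `read_new_lines` and `follow` are made up for the example, and the rotation check (the file got smaller than the saved offset) is a simplifying assumption that would miss a rotated file that has already grown past the old offset.

```python
import os
import time


def read_new_lines(path, offset):
    """Return (complete lines appended since offset, new offset)."""
    if os.path.getsize(path) < offset:
        offset = 0  # file shrank: assume it was rotated and start over
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline so a half-written line is
    # left for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def follow(path, poll_seconds=1.0):
    """Yield lines as they are appended to path, polling forever."""
    offset = 0
    while True:
        lines, offset = read_new_lines(path, offset)
        for line in lines:
            yield line
        if not lines:
            time.sleep(poll_seconds)  # nothing new yet, wait and retry
```

Each line yielded by `follow("access.log")` could then be parsed and inserted as it arrives. Saving the offset somewhere persistent between runs would let a restarted script resume instead of re-reading the file.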
How would I go about doing this?