First I would like to answer the technical aspect of your question:
A WatchEvent just gives you the file name of a changed (or created or deleted) file and nothing more. So if you need any logic beyond that, you have to implement it yourself (or use an existing library, of course).
If you want to read only new lines, you have to remember the position for each file, and whenever the file changes you can move to the last known position. To get the current position you could use a CountingInputStream from the Commons IO package (credits go to [1]). To jump to the last position, you can use its skip method.
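A minimal sketch of that idea (the names positions and readNewLines are my own, and the gzip handling is left out here; a complete example follows in the update below):

def positions = [:]    // file name -> last known position

def readNewLines = { File file ->
    def counting = new org.apache.commons.io.input.CountingInputStream(file.newInputStream())
    counting.skip(positions.get(file.name, 0L))    // jump over everything already read
    counting.withReader('UTF-8') { reader ->
        reader.eachLine { println it }             // only the new lines arrive here
    }
    positions[file.name] = counting.count          // remember where we stopped
}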
But you are using a GZIPInputStream, which means skip will not give you a great performance boost: skipping within a compressed stream is not possible, so GZIPInputStream decompresses the skipped data just as it would when you read it, and you will see only a small improvement (try it!).
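If you want to verify this yourself, a tiny experiment like the following (the file name is made up) shows that skip on a GZIPInputStream takes roughly as long as reading, because every skipped byte still has to be decompressed:

def gz = new java.util.zip.GZIPInputStream(new File('big.log.gz').newInputStream())
def start = System.nanoTime()
def skipped = gz.skip(10 * 1024 * 1024)    // "skips" 10 MB, but decompresses all of it
println "Skipped $skipped bytes in ${(System.nanoTime() - start) / 1e6} ms"
gz.close()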
What I don't understand is why you are using compressed log files at all. Why don't you write uncompressed logs with a DailyRollingFileAppender and compress them at the end of the day, when the application no longer accesses them?
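Sketched for log4j 1.x (file path and pattern are just examples; note that DailyRollingFileAppender does not compress by itself, so gzipping yesterday's file would be a separate scheduled job, e.g. a cron task):

log4j.rootLogger=INFO, FILE
log4j.appender.FILE=org.apache.log4j.DailyRollingFileAppender
log4j.appender.FILE.File=logs/app.log
log4j.appender.FILE.DatePattern='.'yyyy-MM-dd
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{ISO8601} %-5p %c{1} - %m%n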
Another solution could be to keep the GZIPInputStream open (store it) so that you don't have to reread the file each time. Whether this is reasonable may depend on how many log files you have to watch.
Now some questions about your requirements:
You didn't mention why you want to watch the log files in real time. Why don't you centralize your logs (see Centralised Java Logging)? For example, take a look at logstash and this presentation (see [2] and [3]), or at scribe, or at splunk, which is commercial (see [4]).
A centralized log would give you the opportunity to react in real time based on your log data.
[1] https://stackoverflow.com/a/240740/734687
[2] Using elasticsearch, logstash & kibana to create realtime dashboards - slides
[3] Using elasticsearch, logstash & kibana to create realtime dashboards - video
[4] Log Aggregation with Splunk - slides
Update
First, a Groovy script to generate a zipped log file. I start this script from GroovyConsole each time I want to simulate a log file change:
// Run with GroovyConsole each time you want new entries
def file = new File('D:\\Projekte\\watcher_service\\data\\log.gz')

// read the previous content, since appending to a gzip file is not possible
def content
if (file.exists()) {
    new java.util.zip.GZIPInputStream(file.newInputStream()).withStream { inStream ->
        content = inStream.readLines()
    }
}

// write the previous content back, then append the new data
def random = new java.util.Random()
def lineCount = random.nextInt(30) + 1
def outStream = new java.util.zip.GZIPOutputStream(file.newOutputStream())
outStream.withWriter('UTF-8') { writer ->
    content?.each { writer << "$it\n" }
    (1..lineCount).each {
        writer.write "Writing line $it/$lineCount\n"
    }
    writer.write '---Finished---\n'
    // withWriter flushes and closes the stream for us
}
println "Wrote ${lineCount + 1} lines."
Then the logfile reader:
import java.nio.file.FileSystems
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths
import java.nio.file.StandardOpenOption
import java.util.zip.GZIPInputStream
import org.apache.commons.io.input.CountingInputStream
import static java.nio.file.StandardWatchEventKinds.*
class LogReader
{
    private final Path dir = Paths.get('D:\\Projekte\\watcher_service\\data\\')
    private watcher
    private positionMap = [:]
    long lineCount = 0

    static void main(def args)
    {
        new LogReader().processEvents()
    }

    LogReader()
    {
        watcher = FileSystems.getDefault().newWatchService()
        dir.register(watcher, ENTRY_CREATE, ENTRY_DELETE, ENTRY_MODIFY)
    }

    void processEvents()
    {
        def key = watcher.take()
        boolean doLeave = false
        while ((key != null) && !doLeave)
        {
            key.pollEvents().each { event ->
                def kind = event.kind()
                Path name = event.context()
                println "Event received $kind: $name"
                if (kind == ENTRY_MODIFY) {
                    // use the position from the map; if the entry is missing, use the default 0
                    processChange(name, positionMap.get(name.toString(), 0))
                }
                else if (kind == ENTRY_CREATE) {
                    processChange(name, 0)
                }
                else {
                    // ENTRY_DELETE (or OVERFLOW): stop watching
                    doLeave = true
                    return
                }
            }
            key.reset()
            // don't block on take() again once we have decided to leave
            key = doLeave ? null : watcher.take()
        }
    }

    private void processChange(Path name, long position)
    {
        // open the file and move to the last known position
        Path absolutePath = dir.resolve(name)
        def countingStream =
            new CountingInputStream(
                new GZIPInputStream(
                    Files.newInputStream(absolutePath, StandardOpenOption.READ)))
        position = countingStream.skip(position)
        println "Moving to position $position"

        // process each new line; at the first start all lines are read
        int newLineCount = 0
        countingStream.withReader('UTF-8') { reader ->
            reader.eachLine { line ->
                println "${++lineCount}: $line"
                ++newLineCount
            }
        }
        println "${++lineCount}: $newLineCount new lines +++Finished+++"

        // store the new position in the map; count is the number of uncompressed bytes read
        // (withReader has already closed the stream for us)
        positionMap[name.toString()] = countingStream.count
        println "Storing new position $countingStream.count"
    }
}
In the function processChange you can see the creation of the input streams. The line with .withReader creates the InputStreamReader and the BufferedReader. I always use Groovy, it is Java on steroids and once you start using it, you cannot stop. A Java developer should be able to read it, but if you have questions, just comment.